Abstract

We investigate hypergraphic LP relaxations for the Steiner tree problem, primarily the partition
LP relaxation introduced by Könemann et al. [Math. Programming, 2009]. Specifically,
we are interested in proving upper bounds on the integrality gap of this LP, and studying its
relation to other linear relaxations. Our results are the following.

Structural results: We extend the technique of uncrossing, usually applied to families of 
sets, to families of partitions. As a consequence we show that any basic feasible solution to 
the partition LP formulation has sparse support. Although the number of variables could be 
exponential, the number of positive variables is at most the number of terminals. 

Relations with other relaxations: We show the equivalence of the partition LP relaxation 
with other known hypergraphic relaxations. We also show that these hypergraphic relaxations 
are equivalent to the well studied bidirected cut relaxation, if the instance is quasibipartite. 

Integrality gap upper bounds: We show an upper bound of $\sqrt{3} \doteq 1.729$ on the integrality gap
of these hypergraphic relaxations in general graphs. In the special case of uniformly quasibipartite
instances, we show an improved upper bound of $73/60 \doteq 1.216$. By our equivalence theorem,
the latter result implies an improved upper bound for the bidirected cut relaxation as well.

1 Introduction 

In the Steiner tree problem, we are given an undirected graph $G = (V,E)$, non-negative costs $c_e$
for all edges $e \in E$, and a set of terminal vertices $R \subseteq V$. The goal is to find a minimum-cost tree
$T$ spanning $R$, and possibly some Steiner vertices from $V \setminus R$. We can assume that the graph is
complete and that the costs induce a metric. The problem takes a central place in the theory of
combinatorial optimization and has numerous practical applications. Since the Steiner tree problem
is NP-hard^1, we are interested in approximation algorithms for it. The best published approximation
algorithm for the Steiner tree problem is due to Robins and Zelikovsky [29], which for any fixed
$\epsilon > 0$ achieves a performance ratio of $1 + \frac{\ln 3}{2} + \epsilon \doteq 1.55$ in polynomial time; an improvement is
currently in press [3], see also Remark 1.1.

In this paper, we study linear programming (LP) relaxations for the Steiner tree problem, and 
their properties. Numerous such formulations are known (e.g., see [1, 7, 8, 10, 11, 18, 24, 25, 35, 36]), 
and their study has led to impressive running time improvements for integer programming based 
methods. Despite the significant body of work in this area, none of the known relaxations has been
shown to exhibit an integrality gap provably smaller^2 than 2. The integrality gap of a relaxation is the
maximum ratio of the cost of integral and fractional optima, over all instances. It is commonly
regarded as a measure of strength of a formulation. One of the contributions of this paper is a set of
improved bounds on the integrality gap for a number of Steiner tree LP relaxations.

^1 Chlebík and Chlebíková show that no $(96/95 - \epsilon)$-approximation algorithm can exist for any positive $\epsilon$ unless
P=NP [5].

A Steiner tree relaxation of particular interest is the bidirected cut relaxation [11, 36] (precise
definitions will follow in Section 1.2). This relaxation has a flow formulation using $O(|E||R|)$
variables and constraints, which is much more compact than the other relaxations we study. It is
also widely believed to have an integrality gap significantly smaller than 2 (e.g., see [4, 28, 34]).
The largest lower bound on the integrality gap known is 8/7 (by Martin Skutella, reported in [23]),
and Chakrabarty et al. [4] prove an upper bound of 4/3 in so-called quasi-bipartite instances (where
Steiner vertices form an independent set).

Another class of formulations are the so-called hypergraphic LP relaxations for the Steiner tree
problem. These relaxations are inspired by the observation that the minimum Steiner tree problem
can be encoded as a minimum cost hyper-spanning tree (see Section 1.2.2) of a certain hypergraph
on the terminals. They are known to be stronger than the bidirected cut relaxation [26], and it
is therefore natural to try to use them to get better approximation algorithms, by drawing on the
large corpus of known LP techniques. In this paper, we focus on one hypergraphic LP in particular:
the partition LP of Könemann et al. [23].

1.1 Our Results and Techniques 

There are three classes of results in this paper: structural results, equivalence results, and integrality 
gap upper bounds. 

Structural results, Section 2: We extend the powerful technique of uncrossing, traditionally 
applied to families of sets, to families of partitions. Set uncrossing has been very successful in 
obtaining exact and approximate algorithms for a variety of problems (for instance, [13, 21, 31]). 
Using partition uncrossing, we show that any basic feasible solution to the partition LP has at most
$|R| - 1$ positive variables (even though the LP can have an exponentially large number of variables and
constraints).

Equivalence results, Section 3: In addition to the partition LP, two other hypergraphic LPs
have been studied before: one based on subtour elimination due to Warme [35], and a directed
hypergraph relaxation of Polzin and Vahdati Daneshmand [26]; these two are known to be equivalent
[26]. We prove that in fact all three hypergraphic relaxations are equivalent (that is, they have the
same objective value for any Steiner tree instance). We give two proofs (for completeness and to
demonstrate our new techniques), one showing the equivalence of the partition LP and the subtour
LP via partition uncrossing, and one showing the equivalence of the partition LP to the directed
LP via hypergraph orientation results of Frank et al. [14].

We also show that, on quasibipartite instances, the hypergraphic and the bidirected cut LP
relaxations are equivalent. We find this surprising for the following reasons. Firstly, some instances
are known where the hypergraphic relaxation is strictly stronger than the bidirected cut
relaxation [26]. Secondly, the bidirected cut relaxation seems to resist uncrossing techniques; e.g. even
in quasi-bipartite graphs extreme points for bidirected cut can have as many as $\Omega(|V|^2)$ positive
variables [27, Sec. 4.9]. Thirdly, the known approaches to exploiting the bidirected cut relaxation
(mostly primal-dual and local search algorithms [28, 4]) are very different from the combinatorial
hypergraphic algorithms for the Steiner tree problem (almost all of them employ greedy strategies).
In short, there is no qualitative similarity to suggest why the two relaxations should be equivalent!
We believe a better understanding of the bidirected cut relaxation is important because it is central
in theory and practical for implementation.

^2 Proving an integrality gap upper bound of 2 is relatively easy for most relaxations by showing that the minimum spanning
tree restricted on the terminals is within a factor 2 of the LP.

Improved integrality gap upper bounds, Section 4: For uniformly quasibipartite instances 
(quasibipartite instances where for each Steiner vertex, all incident edges have the same cost), we 
show that the integrality gap of the hypergraphic LP relaxations is upper bounded by $73/60 \doteq 1.216$.
Our proof uses the approximation algorithm of Gröpl et al. [20] which achieves the same ratio with
respect to the (integral) optimum. We show, via a simple dual fitting argument, that this ratio is 
also valid with respect to the LP value. To the best of our knowledge this is the only nontrivial 
class of instances where the best currently known approximation ratio and integrality gap upper 
bound are the same. 

For general graphs, we give simple upper bounds of $2\sqrt{2} - 1 \doteq 1.83$ and $\sqrt{3} \doteq 1.729$ on the
integrality gap of the hypergraphic relaxation. Call an instance gainless if the minimum spanning tree
of the terminals is an optimal Steiner tree. To obtain these integrality gap upper bounds, we use
the following key property of the hypergraphic relaxation which was implicit in [23]: on gainless
instances, the LP value equals the cost of the minimum spanning tree and the integrality gap is 1.
Such a theorem was known
for quasibipartite instances and the bidirected cut relaxation (implicitly in [28], explicitly in [4]);
we extend techniques of [4] to obtain improved integrality gaps on all instances.

Remark 1.1. The recent independent work of Byrka et al. [3], which gives an improved
approximation for Steiner trees in general graphs, also shows an integrality gap bound of 1.55 on the
hypergraphic directed cut LP. This is stronger than our integrality gap bounds and was obtained
prior to the completion of our paper; yet we include our bounds because they are obtained using
fairly different methods which might be of independent interest in certain settings.

The proof in [3] can be easily modified to show an integrality gap upper bound of 1.28 in
quasibipartite instances. Then using our equivalence result, we get an integrality gap upper bound of 1.28
for the bidirected cut relaxation on quasibipartite instances, improving the previous best of 4/3.

1.2 Bidirected Cut and Hypergraphic Relaxations 
1.2.1 The Bidirected Cut Relaxation 

The first bidirected LP was given by Edmonds [11] as an exact formulation for the spanning tree
problem. Wong [36] later extended this to obtain the bidirected cut relaxation for the Steiner tree
problem, and gave a dual ascent heuristic based on the relaxation. For this relaxation, introduce
two arcs $(u, v)$ and $(v, u)$ for each edge $uv \in E$, and let both of their costs be $c_{uv}$; let $A$ denote
the resulting arc set. Fix an arbitrary
terminal $r \in R$ as the root. Call a subset $U \subseteq V$ valid if it contains a terminal but not the root, and
let valid(V) be the family of all valid sets. Clearly, the in-tree rooted at $r$ (the directed tree with
all vertices but the root having out-degree exactly 1) of a Steiner tree $T$ must have at least one arc
with tail in $U$ and head outside $U$, for all valid $U$. This leads to the bidirected cut relaxation (B)
(shown in Figure 1 with its dual) which has a variable for each arc $a \in A$, and a constraint
for every valid set $U$. Here and later, $\delta^{out}(U)$ denotes the set of arcs in $A$ whose tail is in $U$ and
whose head lies in $V \setminus U$. When there are no Steiner vertices, Edmonds' work [11] implies this
relaxation is exact.
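
The description above determines the relaxation completely, so it can be written out explicitly. The following is a sketch reconstructed from that description rather than a copy of the original figure; the dual variable name $z_U$ is our own choice.

```latex
\begin{align*}
(\mathcal{B})\quad \min\ & \sum_{a \in A} c_a x_a
  & (\mathcal{B}_D)\quad \max\ & \sum_{U \in \mathrm{valid}(V)} z_U \\
\text{s.t. } & \sum_{a \in \delta^{out}(U)} x_a \ge 1 \quad \forall U \in \mathrm{valid}(V)
  & \text{s.t. } & \sum_{U \,:\, a \in \delta^{out}(U)} z_U \le c_a \quad \forall a \in A \\
 & x_a \ge 0 \quad \forall a \in A
  & & z_U \ge 0 \quad \forall U \in \mathrm{valid}(V)
\end{align*}
```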

Goemans & Myung [18] made significant progress in understanding the LP, by showing that
the bidirected cut LP has the same value independent of which terminal is chosen as the root, and
by showing that a whole "catalogue" of very different-looking LPs also has the same value; later
Goemans [17] showed that if the graph is series-parallel, the relaxation is exact. Rajagopalan and
Vazirani [28] were the first to show a non-trivial integrality gap upper bound of 3/2 on quasibipartite
graphs; this was subsequently improved to 4/3 by Chakrabarty et al. [4], who gave another alternative
formulation for (B).

Figure 1: The bidirected cut relaxation (B) and its dual (B_D).

1.2.2 Hypergraphic Relaxations 

Given a Steiner tree T, a full component of T is a maximal subtree of T all of whose leaves are 
terminals and all of whose internal nodes are Steiner nodes. The edge set of any Steiner tree can 
be partitioned in a unique way into full components by splitting at internal terminals; see Figure 2 
on page 4 for an example. 



Figure 2: Black nodes are terminals and white nodes are Steiner nodes. Left: a Steiner tree for this instance. 
Middle: the Steiner tree's edges are partitioned into full components; there are four full components. Right: 
the hyperedges corresponding to these full components. 
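
The splitting of a Steiner tree into full components can be sketched in a few lines of Python. This is a minimal illustration, assuming the tree is given as an edge list and that every leaf is a terminal; the function name `full_components` and the input format are our own choices, not part of the paper.

```python
from collections import defaultdict


def full_components(tree_edges, terminals):
    """Split the edges of a Steiner tree into full components.

    Two edges belong to the same full component exactly when they can be
    linked through Steiner (non-terminal) vertices only, so we union
    edges that share a Steiner endpoint and never union across terminals.
    Returns a list of (edges, K) pairs, K being the terminals spanned.
    """
    terminals = set(terminals)
    parent = list(range(len(tree_edges)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Group edge indices by the Steiner vertices they touch.
    at_steiner = defaultdict(list)
    for idx, (u, v) in enumerate(tree_edges):
        for w in (u, v):
            if w not in terminals:
                at_steiner[w].append(idx)

    for idxs in at_steiner.values():
        for j in idxs[1:]:
            parent[find(j)] = find(idxs[0])

    groups = defaultdict(list)
    for idx, edge in enumerate(tree_edges):
        groups[find(idx)].append(edge)

    return [
        (edges, frozenset(w for e in edges for w in e if w in terminals))
        for edges in groups.values()
    ]


# Terminals a, b, c, d and one Steiner vertex s; c is an internal terminal.
edges = [("a", "s"), ("b", "s"), ("s", "c"), ("c", "d")]
components = full_components(edges, terminals={"a", "b", "c", "d"})
hyperedges = {K for _, K in components}
```

In this toy instance the two hyperedges are {a, b, c} and {c, d}; their sizes minus one sum to 3, which is the number of terminals minus one.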

Let $\mathcal{K}$ be the set of all nonempty subsets of terminals (hyperedges). We associate with each
$K \in \mathcal{K}$ a fixed full component spanning the terminals in $K$, and let $C_K$ be its cost^3. The problem
of finding a minimum-cost Steiner tree spanning $R$ now reduces to that of finding a minimum-cost
hyper-spanning tree in the hypergraph $(R, \mathcal{K})$.

Spanning trees in (normal) graphs are well understood and there are many different exact LP 
relaxations for this problem. These exact LP relaxations for spanning trees in graphs inspire the 
hypergraphic relaxations for the Steiner tree problem. Such relaxations have a variable $x_K$ for every^4
$K \in \mathcal{K}$, and the different relaxations are based on the constraints used to capture a hyper-spanning
tree, just as constraints on edges are used to capture a spanning tree in a graph. 

The oldest hypergraphic LP relaxation is the subtour LP introduced by Warme [35] which is
inspired by Edmonds' subtour elimination LP relaxation [12] for the spanning tree polytope. This
LP relaxation uses the fact that there are no hypercycles in a hyper-spanning tree, and that it is
spanning. More formally, let $\rho(X) := \max(0, |X| - 1)$ be the rank of a set $X$ of vertices. Then a
sub-hypergraph $(R, \mathcal{K}')$ is a hyper-spanning tree iff $\sum_{K \in \mathcal{K}'} \rho(K) = \rho(R)$ and
$\sum_{K \in \mathcal{K}'} \rho(K \cap S) \le \rho(S)$ for every subset $S$ of $R$. The corresponding LP relaxation,
denoted below as (S), is called the subtour elimination LP relaxation.

$$\min \Big\{ \sum_{K \in \mathcal{K}} C_K x_K \;:\; x \in \mathbb{R}^{\mathcal{K}}_{\ge 0},\ \sum_{K \in \mathcal{K}} x_K \rho(K) = \rho(R),\ \sum_{K \in \mathcal{K}} x_K \rho(K \cap S) \le \rho(S)\ \ \forall S \subset R \Big\} \qquad (\mathcal{S})$$

Warme showed that if the maximum hyperedge size $r$ is bounded by a constant, the LP can be
solved in polynomial time.

^3 We choose the minimum cost full component if there are many. If there is no full component spanning $K$, we let
$C_K$ be infinity. Such a minimum cost component can be found in polynomial time, if $|K|$ is a constant.

^4 Observe that there could be exponentially many hyperedges. This computational issue is circumvented by
considering hyperedges of size at most $r$, for some constant $r$. By a result of Borchers and Du [2], this leads to only
a $(1 + \Theta(1/\log r))$ factor increase in the optimal Steiner tree cost.

The next hypergraphic LP introduced for Steiner tree was a directed hypergraph formulation
(D), introduced by Polzin and Vahdati Daneshmand [26], and inspired by the bidirected cut
relaxation. Given a full component $K$ and a terminal $i \in K$, let $K^i$ denote the arborescence obtained
by directing all the edges of $K$ towards $i$. Think of this as directing the hyperedge $K$ towards $i$ to
get the directed hyperedge $K^i$. Vertex $i$ is called the head of $K^i$ while the terminals in $K \setminus \{i\}$ are
the tails of $K^i$. The cost of each directed hyperedge $K^i$ is the cost of the corresponding undirected
hyperedge $K$. In the directed hypergraph formulation, there is a variable $x_{K^i}$ for every directed
hyperedge $K^i$. As in the bidirected cut relaxation, there is a vertex $r \in R$ which is a root, and as
described above, a subset $U \subseteq R$ of terminals is valid if it does not contain the root but contains
at least one vertex in $R$. We let $\Delta^{out}(U)$ be the set of directed full components coming out of $U$,
that is all $K^i$ such that $U \cap K \neq \emptyset$ but $i \notin U$. Let $\vec{\mathcal{K}}$ be the set of all directed hyperedges. We
show the directed hypergraph relaxation and its dual in Figure 3.

Figure 3: The directed hypergraph relaxation (D) and its dual (D_D).

Polzin & Vahdati Daneshmand [26] showed that OPT(D) = OPT(S). Moreover they observed
that this directed hypergraphic relaxation strengthens the bidirected cut relaxation.

Lemma 1.2 ([26]). For any instance, OPT(D) $\ge$ OPT(B).

Proof sketch. It suffices to show that any solution $x$ of (D) can be converted to a feasible solution
$x'$ of (B) of the same cost. For each arc $a$, let $x'_a$ be the sum of $x_{K^i}$ over all directed full components $K^i$
that (when viewed as an arborescence) contain $a$. Now for any valid subset $U$ of $V$, it is not
hard to see that every directed full component leaving $R \cap U$ has at least one arc leaving $U$, hence
$\sum_{a \in \delta^{out}(U)} x'_a \ge \sum_{K^i \in \Delta^{out}(R \cap U)} x_{K^i} \ge 1$ and $x'$ is feasible as needed. □

See [26] for an example where the strict inequality OPT(D) > OPT(B) holds.

Könemann et al. [23], inspired by the work of Chopra [6], described a partition-based
relaxation which captures that given any partition of the terminals, any hyper-spanning tree must
have sufficiently many "cross hyperedges". More formally, a partition, $\pi$, is a collection of
pairwise disjoint nonempty terminal sets $(\pi_1, \ldots, \pi_q)$ whose union equals $R$. The number of parts $q$
of $\pi$ is referred to as the partition's rank and denoted as $r(\pi)$. Let $\Pi_R$ be the set of all
partitions of $R$. Given a partition $\pi = \{\pi_1, \ldots, \pi_q\}$, define the rank contribution $\mathrm{rc}^\pi_K$ of hyperedge
$K \in \mathcal{K}$ for $\pi$ as the rank reduction of $\pi$ obtained by merging the parts of $\pi$ that are touched
by $K$; i.e., $\mathrm{rc}^\pi_K := |\{i : K \cap \pi_i \neq \emptyset\}| - 1$. Then a hyper-spanning tree $(R, \mathcal{K}')$ must satisfy
$\sum_{K \in \mathcal{K}'} \mathrm{rc}^\pi_K \ge r(\pi) - 1$. The partition based LP of [23] and its dual are given in Figure 4.



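\begin{align*}
\min \Big\{ \sum_{K \in \mathcal{K}} C_K x_K \;:\;\; & x \in \mathbb{R}^{\mathcal{K}}_{\ge 0} && (\mathcal{P}) \\
& \sum_{K \in \mathcal{K}} x_K \mathrm{rc}^\pi_K \ge r(\pi) - 1, \quad \forall \pi \in \Pi_R \Big\} && (5) \\[1ex]
\max \Big\{ \sum_{\pi \in \Pi_R} (r(\pi) - 1) \cdot y_\pi \;:\;\; & y \in \mathbb{R}^{\Pi_R}_{\ge 0} && (\mathcal{P}_D) \\
& \sum_{\pi \in \Pi_R} y_\pi \mathrm{rc}^\pi_K \le C_K, \quad \forall K \in \mathcal{K} \Big\} && (6)
\end{align*}

Figure 4: The unbounded partition relaxation (P) and its dual (P_D).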
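
To make the rank contribution and constraint (5) concrete, the following Python sketch checks the partition inequalities for a given vector x by brute-force enumeration of all partitions of a small terminal set. The helper names (`set_partitions`, `rank_contribution`, `violated_partitions`) are hypothetical, and the enumeration is only practical for tiny instances since the number of partitions grows very quickly.

```python
def set_partitions(elements):
    """Yield every partition of `elements` as a list of frozensets."""
    elements = list(elements)
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for smaller in set_partitions(rest):
        # Insert `first` into each existing part in turn ...
        for i, part in enumerate(smaller):
            yield smaller[:i] + [part | {first}] + smaller[i + 1:]
        # ... or give it a part of its own.
        yield smaller + [frozenset({first})]


def rank_contribution(partition, K):
    """rc^pi_K: number of parts of the partition touched by K, minus one."""
    return sum(1 for part in partition if part & K) - 1


def violated_partitions(R, x):
    """Return the partitions pi of R with sum_K x_K rc^pi_K < r(pi) - 1.

    `x` maps hyperedges (frozensets of terminals) to nonnegative values.
    """
    bad = []
    for pi in set_partitions(R):
        lhs = sum(val * rank_contribution(pi, K) for K, val in x.items())
        if lhs < len(pi) - 1:
            bad.append(pi)
    return bad


R = {1, 2, 3, 4}
# A hyper-spanning tree: hyperedges {1,2,3} and {3,4}.
tree = {frozenset({1, 2, 3}): 1, frozenset({3, 4}): 1}
# Not spanning: {1,2} and {3,4} leave the two pairs disconnected.
not_tree = {frozenset({1, 2}): 1, frozenset({3, 4}): 1}
```

The indicator vector of the hyper-spanning tree satisfies every partition constraint, whereas the second vector violates the constraint for the partition {{1,2},{3,4}}, which no chosen hyperedge crosses.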



The feasible region of (P) is unbounded, since if $x$ is a feasible solution for (P) then so is any
$x' \ge x$. We obtain a bounded partition LP relaxation, denoted by (P') and shown below, by adding
a valid equality constraint to the LP.

$$\min \Big\{ \sum_{K \in \mathcal{K}} C_K x_K \;:\; x \in (\mathcal{P}),\ \sum_{K \in \mathcal{K}} x_K (|K| - 1) = |R| - 1 \Big\} \qquad (\mathcal{P}')$$

1.2.3 Discussion of Computational Issues 

The bidirected cut relaxation is very attractive from a perspective of computational implementation.
Although the formulation given in Section 1.2.1 has an exponential number of constraints, an
equivalent compact flow formulation with $O(|E||R|)$ variables and constraints is well-known.

What is known regarding solving the hypergraphic LPs? They are good enough to get
theoretical results but less attractive in practice, as we now explain. Using a separation oracle, Warme
showed [35] that for any chosen family $\mathcal{K}$ of full components, the subtour LP can be optimized in
time poly$(|V|, |\mathcal{K}|)$. For the common $r$-restricted setting of $\mathcal{K}$ to be all possible full components
of size at most $r$ for constant $r$, we have $|\mathcal{K}| \le \binom{|R|}{r}$. This is polynomial for any fixed $r$, and the
relative error caused by this choice of $r$ is at most the $r$-Steiner ratio $\rho_r = 1 + \Theta(1/\log r)$ [2]. But
this is not so practical: to get relative error $1 + \epsilon$, we apply the ellipsoid algorithm to an LP with
$|R|^{\exp(\Theta(1/\epsilon))}$ variables!

In the unrestricted setting where /C contains all possible full components without regard to size, 
it is an open problem to optimize any of the hypergraphic LPs exactly in polynomial time. We make 
some progress here: in quasibipartite instances, the proof method of our hypergraphic-bidirected 
equivalence theorem (Section 3.3) implies that one can exactly compute the LP optimal value, and 
a dual optimal solution. Regarding this open problem, we note that the $r$-restricted LP optimum
is at most $\rho_r$ times the unrestricted optimum, and wonder whether there might be some advantage
gained by using the fact that the hypergraphic LPs have sparse optima.

We reiterate our feeling that it is important to obtain practical algorithms and understand 
the bidirected cut relaxation as well as possible, e.g. we know now that it has an integrality gap 
of at most 1.28 on quasi-bipartite instances, but obtaining such a bound directly could give new 
insights. 

1.2.4 Other Related Work 

In the special case of $r$-restricted instances for $r = 3$, the partition hypergraphic LP is essentially
a special case of an LP introduced by Vande Vate [33] for matroid matching, which is totally dual
half-integral [16]. Additional facts about the hypergraphic relaxations appear in the thesis of the
third author [27], e.g. a combinatorial "gainless tree formulation" for the LPs similar in flavour to
the "1-tree bound" for the Held-Karp TSP relaxation.

2 Uncrossing Partitions 

In this section we are interested in uncrossing a minimal set of tight partitions that uniquely define
a basic feasible solution to (P). We start with a few preliminaries necessary to state our result
formally.

2.1 Preliminaries 

We introduce some needed well-known properties of partitions that arise in combinatorial lattice 
theory [32]. 

Definition 2.1. We say that a partition $\pi'$ refines another partition $\pi$ if each part of $\pi'$ is contained
in some part of $\pi$. We also say $\pi$ coarsens $\pi'$. Two partitions cross if neither refines the other.
A family of partitions forms a chain if no pair of them cross. Equivalently, a chain is any family
$\pi^1, \pi^2, \ldots, \pi^t$ such that $\pi^i$ refines $\pi^{i-1}$ for each $1 < i \le t$.

The family $\Pi_R$ of all partitions of $R$ forms a lattice with a meet operator $\wedge : \Pi_R^2 \to \Pi_R$ and a
join operator $\vee : \Pi_R^2 \to \Pi_R$. The meet $\pi \wedge \pi'$ is the coarsest partition that refines both $\pi$ and $\pi'$,
and the join $\pi \vee \pi'$ is the most refined partition that coarsens both $\pi$ and $\pi'$. See Figure 5
for an illustration.

Definition 2.2 (Meet of partitions). Let the parts of $\pi$ be $\pi_1, \ldots, \pi_t$ and let the parts of $\pi'$ be
$\pi'_1, \ldots, \pi'_u$. Then the parts of the meet $\pi \wedge \pi'$ are the nonempty intersections of parts of $\pi$ with parts
of $\pi'$,

$$\pi \wedge \pi' = \{\pi_i \cap \pi'_j \mid 1 \le i \le t,\ 1 \le j \le u \text{ and } \pi_i \cap \pi'_j \neq \emptyset\}.$$

Given a graph $G$ and a partition $\pi$ of $V(G)$, we say that $G$ induces $\pi$ if the parts of $\pi$ are the vertex
sets of the connected components of $G$.

Definition 2.3 (Join of partitions). Let $(R, E)$ be a graph that induces $\pi$, and let $(R, E')$ be a
graph that induces $\pi'$. Then the graph $(R, E \cup E')$ induces $\pi \vee \pi'$.




Figure 5: Illustrations of some partitions. The black dots are the terminal set $R$. (a): two partitions;
neither refines the other. (b): the meet of the partitions from (a). (c): the join of the partitions from (a).
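
The meet and join of Definitions 2.2 and 2.3 translate directly into code. The following Python sketch represents a partition as a frozenset of frozenset parts; this representation and the function names are our own illustrative choices. The join is computed with a union-find structure, which mimics taking connected components of the union of two inducing graphs.

```python
def normalize(parts):
    """Represent a partition as a frozenset of frozenset parts."""
    return frozenset(frozenset(p) for p in parts)


def meet(pi, rho):
    """Coarsest common refinement: all nonempty pairwise intersections."""
    return frozenset(p & q for p in pi for q in rho if p & q)


def join(pi, rho):
    """Most refined common coarsening of two partitions of the same set.

    Elements sharing a part in either partition are linked, and the
    resulting connected components are the parts of the join.
    """
    parent = {v: v for part in pi for v in part}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for part in list(pi) + list(rho):
        members = list(part)
        for v in members[1:]:
            parent[find(v)] = find(members[0])

    groups = {}
    for v in parent:
        groups.setdefault(find(v), set()).add(v)
    return normalize(groups.values())


def refines(fine, coarse):
    """True if every part of `fine` lies inside some part of `coarse`."""
    return all(any(p <= q for q in coarse) for p in fine)


def cross(pi, rho):
    """Two partitions cross if neither refines the other."""
    return not refines(pi, rho) and not refines(rho, pi)


pi = normalize([{1, 2}, {3, 4}])
rho = normalize([{1, 3}, {2, 4}])
finest = meet(pi, rho)    # every terminal in its own part
coarsest = join(pi, rho)  # a single part containing all terminals
```

For these two crossing partitions the meet is the partition into singletons and the join is the one-part partition.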
Given a feasible solution $x$ to (P), a partition $\pi$ is tight if $\sum_{K \in \mathcal{K}} x_K \mathrm{rc}^\pi_K = r(\pi) - 1$. Let
tight$(x)$ be the set of all tight partitions. We are interested in uncrossing this set of partitions.
More precisely, we wish to find a cross-free set of partitions (chain) which uniquely defines $x$. One
way would be to prove the following.

Property 2.4. If two crossing partitions $\pi$ and $\pi'$ are in tight$(x)$, then so are $\pi \wedge \pi'$ and $\pi \vee \pi'$.

This type of property is already well-used [9, 13, 21, 31] for sets (with meets and joins replaced
by intersections and unions respectively). The typical
proof considers the constraints in (P) corresponding to $\pi$ and $\pi'$ and uses the "supermodularity"
of the RHS and the "submodularity" of the coefficients in the LHS. In particular, if the following
is true,

$$\forall \pi, \pi' : \quad r(\pi \vee \pi') + r(\pi \wedge \pi') \ge r(\pi) + r(\pi') \qquad (7)$$
$$\forall K, \pi, \pi' : \quad \mathrm{rc}^{\pi}_K + \mathrm{rc}^{\pi'}_K \ge \mathrm{rc}^{\pi \vee \pi'}_K + \mathrm{rc}^{\pi \wedge \pi'}_K \qquad (8)$$

then Property 2.4 can be proved easily by writing a string of inequalities.^5

Inequality (7) is indeed true (see, for example, [32]), but unfortunately inequality (8) is not true
in general, as the following example shows.

Example 2.5. Let $R = \{1, 2, 3, 4\}$, $\pi = \{\{1, 2\}, \{3, 4\}\}$ and $\pi' = \{\{1, 3\}, \{2, 4\}\}$. Let $K$ denote the
full component $\{1, 2, 3, 4\}$. Then $\mathrm{rc}^{\pi}_K + \mathrm{rc}^{\pi'}_K = 1 + 1 < 0 + 3 = \mathrm{rc}^{\pi \vee \pi'}_K + \mathrm{rc}^{\pi \wedge \pi'}_K$.
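
The numbers in Example 2.5 are easy to check mechanically. The short Python sketch below hard-codes the meet and join of the two partitions (rather than computing them) and evaluates both sides of the false inequality (8) together with the true inequality (7); the helper name `rank_contribution` is an illustrative choice.

```python
def rank_contribution(partition, K):
    """rc^pi_K: number of parts of the partition touched by K, minus one."""
    return sum(1 for part in partition if part & K) - 1


pi = [{1, 2}, {3, 4}]
pi_prime = [{1, 3}, {2, 4}]
meet = [{1}, {2}, {3}, {4}]  # pi ∧ pi': all singletons
join = [{1, 2, 3, 4}]         # pi ∨ pi': a single part
K = {1, 2, 3, 4}

# Left- and right-hand sides of inequality (8) for this K.
lhs_8 = rank_contribution(pi, K) + rank_contribution(pi_prime, K)
rhs_8 = rank_contribution(join, K) + rank_contribution(meet, K)

# Left- and right-hand sides of inequality (7).
lhs_7 = len(join) + len(meet)
rhs_7 = len(pi) + len(pi_prime)
```

Here `lhs_8` is 2 and `rhs_8` is 3, so (8) fails, while (7) holds with 5 on the left and 4 on the right.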

Nevertheless, Property 2.4 is true; its correct proof is given in Section 2.2 and depends on a
simple though subtle extension of the usual approach. The crux of the insight needed to fix the
approach is not to consider pairs of constraints in (P), but rather multi-sets which may contain
more than two inequalities. Using this uncrossing result, we can prove the following theorem (details
are given in Section 2.3). Here, we let $\underline{\pi}$ denote $\{R\}$, the unique partition with (minimal) rank 1;
later we use $\overline{\pi}$ to denote $\{\{r\} \mid r \in R\}$, the unique partition with (maximal) rank $|R|$.

Theorem 1. Let $x^*$ be a basic feasible solution of (P), and let $\mathcal{C}$ be an inclusion-wise maximal
chain in tight$(x^*) \setminus \underline{\pi}$. Then $x^*$ is uniquely defined by

$$\sum_{K \in \mathcal{K}} \mathrm{rc}^{\pi}_K x^*_K = r(\pi) - 1 \qquad \forall \pi \in \mathcal{C}. \qquad (9)$$

Any chain of distinct partitions of $R$ that does not contain $\underline{\pi}$ has size at most $|R| - 1$, and
this is an upper bound on the rank of the system in (9). Elementary linear programming theory
immediately yields the following corollary.

Corollary 2.6. Any basic solution $x^*$ of (P) has at most $|R| - 1$ non-zero coordinates.

2.2 Partition Uncrossing Inequalities 

We start with the following definition. 

Definition 2.7. Let $\pi \in \Pi_R$ be a partition and let $S \subseteq R$. Define the merged partition $m(\pi, S)$ to
be the most refined partition that coarsens $\pi$ and contains all of $S$ in a single part. See Figure 6
for an example. Informally, $m(\pi, S)$ is obtained by merging all parts of $\pi$ which intersect $S$.
Formally, $m(\pi, S)$ equals the set of parts $\{\pi_j\}_{j : \pi_j \cap S = \emptyset} \cup \{\bigcup_{j : \pi_j \cap S \neq \emptyset} \pi_j\}$.

Figure 6: Illustration of merging. The left figure shows a (solid) partition $\pi$ along with a (dashed)
set $S$. The right figure shows the merged partition $m(\pi, S)$.



We will use the following straightforward fact later:

$$\mathrm{rc}^{\pi}_K = r(\pi) - r(m(\pi, K)). \qquad (10)$$
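
The merge operation and the identity (10) can be illustrated with a small Python sketch. Partitions are again represented as frozensets of frozenset parts, and the function names `merge` and `rank_contribution` are hypothetical helpers chosen for this illustration.

```python
def merge(partition, S):
    """m(pi, S): merge all parts of the partition that intersect S."""
    S = frozenset(S)
    touched = [part for part in partition if part & S]
    untouched = [part for part in partition if not part & S]
    merged = [frozenset().union(*touched)] if touched else []
    return frozenset(untouched + merged)


def rank_contribution(partition, K):
    """rc^pi_K: number of parts of the partition touched by K, minus one."""
    K = frozenset(K)
    return sum(1 for part in partition if part & K) - 1


pi = frozenset(map(frozenset, [{1, 2}, {3}, {4, 5}, {6}]))
K = {2, 3, 6}
merged = merge(pi, K)  # parts {1,2}, {3} and {6} collapse into one
```

In this example the hyperedge K touches three of the four parts, so merging lowers the rank from 4 to 2, matching a rank contribution of 2.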



We now state the (true) inequalities which replace the false inequality (8). Later, we show how 
one uses these to obtain partition uncrossing, e.g. to prove Property 2.4. 

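Lemma 2.8 (Partition Uncrossing Inequalities). Let $\pi, \pi' \in \Pi_R$ and let the parts of $\pi$ be $\pi_1, \pi_2, \ldots, \pi_{r(\pi)}$. Then

$$r(\pi)\,[r(\pi') - 1] + [r(\pi) - 1] = [r(\pi \wedge \pi') - 1] + \sum_{i=1}^{r(\pi)} [r(m(\pi', \pi_i)) - 1] \qquad (11)$$

$$\forall K \in \mathcal{K} : \quad r(\pi)\,[\mathrm{rc}^{\pi'}_K] + [\mathrm{rc}^{\pi}_K] \ge [\mathrm{rc}^{\pi \wedge \pi'}_K] + \sum_{i=1}^{r(\pi)} [\mathrm{rc}^{m(\pi', \pi_i)}_K] \qquad (12)$$
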



Before giving the proof of the above lemma, let us first show how it can be used to prove the 
statement Property 2.4. 



Proof of Property 2.4. Since $\pi$ and $\pi'$ are tight,

\begin{align*}
r(\pi)[r(\pi') - 1] + [r(\pi) - 1] &= r(\pi) \sum_{K} x_K \mathrm{rc}^{\pi'}_K + \sum_{K} x_K \mathrm{rc}^{\pi}_K \\
&\ge \sum_{K} x_K \mathrm{rc}^{\pi \wedge \pi'}_K + \sum_{i=1}^{r(\pi)} \sum_{K} x_K \mathrm{rc}^{m(\pi', \pi_i)}_K \\
&\ge [r(\pi \wedge \pi') - 1] + \sum_{i=1}^{r(\pi)} [r(m(\pi', \pi_i)) - 1] = r(\pi)[r(\pi') - 1] + [r(\pi) - 1]
\end{align*}

where the first inequality follows from (12) and the second from (5) (as $x$ is feasible); the last
equality is (11). Since the first and last terms are equal, all the inequalities are equalities, in particular
our application of (5) shows that $\pi \wedge \pi'$ and each $m(\pi', \pi_i)$ is tight. Iterating the latter fact, we see
that $m(\cdots m(m(\pi', \pi_1), \pi_2), \cdots) = \pi \vee \pi'$ is also tight. □

^5 In this hypothetical scenario we get $r(\pi) + r(\pi') - 2 = \sum_K x_K(\mathrm{rc}^{\pi}_K + \mathrm{rc}^{\pi'}_K) \ge \sum_K x_K(\mathrm{rc}^{\pi \wedge \pi'}_K + \mathrm{rc}^{\pi \vee \pi'}_K) \ge
r(\pi \wedge \pi') + r(\pi \vee \pi') - 2 \ge r(\pi) + r(\pi') - 2$; thus the inequalities hold with equality, and the middle one shows $\pi \wedge \pi'$
and $\pi \vee \pi'$ are tight.



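To prove the inequalities in Lemma 2.8 we need the following lemma that relates the rank of sets
and the rank contribution of partitions. Recall $\rho(X) := \max(0, |X| - 1)$.

Lemma 2.9. For a partition $\pi = \{\pi_1, \ldots, \pi_t\}$ of $R$, where $t = r(\pi)$, and for any nonempty $K \subseteq R$, we have

$$\sum_{i=1}^{t} \rho(K \cap \pi_i) = \rho(K) - \mathrm{rc}^{\pi}_K.$$

Proof. By definition, $K \cap \pi_i \neq \emptyset$ for exactly $1 + \mathrm{rc}^{\pi}_K$ values of $i$. Also, $\rho(K \cap \pi_i) = 0$ for all other
$i$. Hence

$$\sum_{i=1}^{t} \rho(K \cap \pi_i) = \sum_{i : K \cap \pi_i \neq \emptyset} (|K \cap \pi_i| - 1) = \Big( \sum_{i : K \cap \pi_i \neq \emptyset} |K \cap \pi_i| \Big) - (\mathrm{rc}^{\pi}_K + 1). \qquad (13)$$

Observe that $\sum_{i : K \cap \pi_i \neq \emptyset} |K \cap \pi_i| = |K| = \rho(K) + 1$; using this fact together with Equation (13) we
obtain

$$\sum_{i=1}^{t} \rho(K \cap \pi_i) = \Big( \sum_{i : K \cap \pi_i \neq \emptyset} |K \cap \pi_i| \Big) - (\mathrm{rc}^{\pi}_K + 1) = \rho(K) + 1 - (\mathrm{rc}^{\pi}_K + 1).$$

Rearranging, the proof of Lemma 2.9 is complete. □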

Proof of Lemma 2.8. First, we argue that $\pi \wedge \pi' = \overline{\pi}$ holds without loss of generality. In the general
case, for each part $p$ of $\pi \wedge \pi'$ with $|p| \ge 2$, contract $p$ into one pseudo-vertex and define the new
$K$ to include the pseudo-vertex corresponding to $p$ if and only if $K \cap p \neq \emptyset$. This contraction does
not affect the value of any of the terms in Equations (11) and (12), so is without loss of generality.
After contraction, for any part $\pi_i$ of $\pi$ and part $\pi'_j$ of $\pi'$, we have $|\pi_i \cap \pi'_j| \le 1$, so indeed $\pi \wedge \pi' = \overline{\pi}$.

Proof of Equation (11). Fix $i$. Since $|\pi_i \cap \pi'_j| \le 1$ for all $j$, the rank contribution $\mathrm{rc}^{\pi'}_{\pi_i}$ is equal to
$|\pi_i| - 1$. Then using Equation (10) we know that $r(m(\pi', \pi_i)) = r(\pi') - |\pi_i| + 1$. Thus adding over
all $i$, the right-hand side of Equation (11) is equal to

$$|R| - 1 + \sum_{i=1}^{r(\pi)} \big(r(\pi') - |\pi_i|\big) = |R| - 1 + r(\pi) r(\pi') - |R|$$

and this is precisely the left-hand side of Equation (11). □

Proof of Equation (12). Fix $i$. Since $|\pi_i \cap \pi'_j| \le 1$ for all $j$, we have

$$\mathrm{rc}^{\pi'}_K - \mathrm{rc}^{m(\pi', \pi_i)}_K \ge \rho(\pi_i \cap K) \qquad (14)$$

because, when we merge the parts of $\pi'$ intersecting $\pi_i$, we make $K$ span at least $\rho(\pi_i \cap K)$ fewer
parts. Note that the inequality could be strict if both $\pi_i$ and $K$ intersect a part of $\pi'$ without
having a common vertex in that part.

Summing inequality (14) over all $i$ gives

$$\sum_{i=1}^{r(\pi)} \big(\mathrm{rc}^{\pi'}_K - \mathrm{rc}^{m(\pi', \pi_i)}_K\big) \ge \sum_{i=1}^{r(\pi)} \rho(\pi_i \cap K) = \rho(K) - \mathrm{rc}^{\pi}_K, \qquad (15)$$

where the last equality follows from Lemma 2.9. To finish the proof we observe $\rho(K) = \mathrm{rc}^{\pi \wedge \pi'}_K$,
since $\pi \wedge \pi' = \overline{\pi}$. □

This completes the proof of Lemma 2.8. □
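
As a sanity check (not a substitute for the proof), the two statements of Lemma 2.8 can be verified exhaustively on a small ground set. The Python sketch below enumerates all ordered pairs of partitions of a four-element terminal set and all nonempty hyperedges, and tests Equation (11) and inequality (12) directly. All function names are our own, and `rho` in the code plays the role of $\pi'$.

```python
from itertools import combinations


def set_partitions(elements):
    """Yield every partition of `elements` as a list of frozensets."""
    elements = list(elements)
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for smaller in set_partitions(rest):
        for i, part in enumerate(smaller):
            yield smaller[:i] + [part | {first}] + smaller[i + 1:]
        yield smaller + [frozenset({first})]


def rc(partition, K):
    """Rank contribution: parts touched by K, minus one."""
    return sum(1 for part in partition if part & K) - 1


def meet(pi, rho):
    """Nonempty pairwise intersections of parts."""
    return [p & q for p in pi for q in rho if p & q]


def merge(partition, S):
    """m(partition, S): merge all parts that intersect S."""
    touched = [p for p in partition if p & S]
    untouched = [p for p in partition if not p & S]
    return untouched + ([frozenset().union(*touched)] if touched else [])


def check_uncrossing(R):
    """Return True if (11) and (12) hold for all pi, pi' and K over R."""
    R = sorted(R)
    partitions = list(set_partitions(R))
    hyperedges = [frozenset(c) for k in range(1, len(R) + 1)
                  for c in combinations(R, k)]
    for pi in partitions:
        for rho in partitions:  # rho plays the role of pi'
            merges = [merge(rho, part) for part in pi]
            m = meet(pi, rho)
            # Equation (11): an identity on ranks.
            lhs = len(pi) * (len(rho) - 1) + (len(pi) - 1)
            rhs = (len(m) - 1) + sum(len(mp) - 1 for mp in merges)
            if lhs != rhs:
                return False
            # Inequality (12): one inequality per hyperedge K.
            for K in hyperedges:
                lhs = len(pi) * rc(rho, K) + rc(pi, K)
                rhs = rc(m, K) + sum(rc(mp, K) for mp in merges)
                if lhs < rhs:
                    return False
    return True


ok = check_uncrossing({1, 2, 3, 4})
```

The check covers the general case, without first contracting the parts of the meet, which is consistent with the claim that the contraction step in the proof is without loss of generality.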



2.3 Sparsity of Basic Feasible Solutions: Proof of Theorem 1 

Proof. Let supp(x*) be the full components K with x*j^ > 0. Consider the constraint submatrix 
with rows corresponding to the tight partitions and columns corresponding to the full components 
in supp(2;*). Since x* is a basic feasible solution, any full-rank subset of rows uniquely defines x* . 
We now show that any maximal chain C in tight(x*) corresponds to such a subset. 

Let row(7r) e R^"pp{^'*) denote the row corresponding to partition vr of this matrix, i.e., row(7r)/f = 
rc^, and given a collection TZ of partitions (rows), let span(7^) denote the linear span of the rows in 
TZ. We now prove that for any tight partition vr ^ C, we have row(7r) G span(C); this will complete 
the proof of the theorem. 

For sake of contradiction, suppose row(7r) span(C). Choose vr to be the counterexample 
partition with smallest rank r(7r). Firstly, since C is maximal, vr must cross some partition a in C. 
Choose cr to be the most refined partition in C which crosses vr. Let the parts of a be (ui, . . . , at). 
The following claim uses the partition uncrossing inequalities to derive a linear dependence between 
the rows corresponding to a, vr and the partitions formed by merging parts of a with vr. 

Claim 2.10. We have row(cr) + \r{a)\ ■ row(vr) = row(vr A o") + Yll=i row(m(vr, o",)). 

Proof. Since a and vr are both tight partitions, the proof of Property 2.4 shows that the partition 
inequality (12) holds with equality for all K G supp(x*), vr and a, implying the claim. □ 

Let cp^(cr) be the parts of a which intersect at least two parts of vr; i.e., merging the parts of 
vr that intersect fij, for any G cp^((T), decreases the rank of vr. Formally, 

cp^(o-) := {(Tj G : m(vr, CTj) / vr} 

Note that one can modify Claim 2.10 by subtracting (r(cr) — |cp^((j)|)row(vr) from both sides to get 

row((T) + I cp^(cr) I • row(vr) = row(vr A cr) + row(m(vr, cJi)) (16) 

criGcp^(o-) 

Now if row(vr) ^ span(C), we must have either row(vr A a) is not in span(C) or row(m(vr, (jj)) is 
not in span(C) for some i. We show that either case leads to the needed contradiction, which will 
prove the theorem. 

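Case 1: $\mathrm{row}(\pi \wedge \sigma) \notin \mathrm{span}(\mathcal{C})$. Note there is $\sigma' \in \mathcal{C}$ which crosses $\pi \wedge \sigma$, since $\pi \wedge \sigma$ is not in the maximal chain $\mathcal{C}$. Since $\sigma', \sigma \in \mathcal{C}$ and by considering the refinement order, it is easy to see that $\sigma'$ (strictly) refines $\sigma$ and $\sigma'$ crosses $\pi$. This contradicts our choice of $\sigma$ as the most refined partition in $\mathcal{C}$ crossing $\pi$, since $\sigma'$ was also a candidate.

Case 2: $\mathrm{row}(m(\pi, \sigma_i)) \notin \mathrm{span}(\mathcal{C})$. Note $m(\pi, \sigma_i)$ is also tight. Since $\sigma_i \in \mathrm{cp}_\pi(\sigma)$, $m(\pi, \sigma_i)$ has smaller rank than $\pi$. This contradicts our choice of $\pi$.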

This completes the proof of Theorem 1. □

3 Equivalence of Formulations

In this section we describe our equivalence results. A summary of the known and new results is
given in Figure 7.



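[Figure 7: Summary of relations among various LP relaxations. Recoverable labels: nodes OPT($\mathcal{P}$), OPT($\mathcal{P}'$), OPT($\mathcal{D}$), OPT($\mathcal{B}$) and OPT($\mathcal{S}$); edge annotations [Thm. 2], "= [Appendix A]", "$\ge$ [Lemma 1.2], [26]", [Thm. 4], and "[26], $\le$ in quasi-bipartite [Thm. 5]".]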

As we mentioned in the introduction, we give a redundant set of proofs for completeness and to demonstrate novel techniques. The proof that $(\mathcal{P})$ and $(\mathcal{D})$ have the same value, which appears in Appendix A, is a consequence of hypergraph orientation results of Frank et al. [14].



3.1 Bounded and Unbounded Partition Relaxations

Theorem 2. The LPs $(\mathcal{P}')$ and $(\mathcal{P})$ have the same optimal value.

We actually prove a stronger statement.

Definition 3.1. The collection $\mathcal{K}$ of hyperedges is down-closed if whenever $S \in \mathcal{K}$ and $\emptyset \neq T \subset S$, then $T \in \mathcal{K}$. For down-closed $\mathcal{K}$, the cost function $C : \mathcal{K} \to \mathbb{R}_+$ is non-decreasing if $C_S \le C_T$ whenever $S \subset T$.

Theorem 3. If the set of hyperedges is down-closed and the cost function is non-decreasing, then $(\mathcal{P}')$ and $(\mathcal{P})$ have the same optimal value.

Theorem 3 implies Theorem 2 since the hypergraph and cost function derived from instances of the Steiner tree problem are down-closed and non-decreasing (e.g. $C_{\{k\}} = 0$ for every $k \in R$; we remark that the variables $x_{\{k\}}$ act just as placeholders). Our proof of Theorem 2 relies on the following operation which we call shrinking.

Definition 3.2. Given an assignment $x : \mathcal{K} \to \mathbb{R}_+$ to the full components, suppose $x_K > 0$ for some $K$. The operation $\mathrm{Shrink}(x, K, K', \delta)$, where $K' \subset K$, $|K'| = |K| - 1$ and $0 < \delta \le x_K$, changes $x$ to $x'$ by decreasing $x'_K := x_K - \delta$ and increasing $x'_{K'} := x_{K'} + \delta$.

Note that shrinking is defined only for down-closed hypergraphs. Also note that a shrinking operation cannot increase the cost of the solution if the cost function is non-decreasing. The theorem is proved by taking the optimum solution to $(\mathcal{P})$ which minimizes the sum $\sum_{K \in \mathcal{K}} x_K |K|$, and then showing that this must satisfy the equality in $(\mathcal{P}')$, or a shrinking operation can be performed. Now we give the details.
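The shrinking operation of Definition 3.2 can be sketched in a few lines of Python. This is only an illustration: the dictionary representation of $x$ (keys are frozensets of terminals) and the helper names `shrink` and `weighted_size` are assumptions made here, not part of the paper.

```python
from fractions import Fraction


def shrink(x, K, K_prime, delta):
    """Return a copy of x after Shrink(x, K, K', delta) from Definition 3.2."""
    K, K_prime = frozenset(K), frozenset(K_prime)
    # K' must be obtained from K by dropping exactly one terminal.
    if not (K_prime < K and len(K_prime) == len(K) - 1):
        raise ValueError("K' must be K minus exactly one terminal")
    if not (0 < delta <= x.get(K, 0)):
        raise ValueError("need 0 < delta <= x_K")
    new_x = dict(x)
    new_x[K] = x[K] - delta                       # decrease x_K by delta
    new_x[K_prime] = x.get(K_prime, 0) + delta    # increase x_{K'} by delta
    return new_x


def weighted_size(x):
    """The potential sum_K x_K * |K| that the proof minimizes."""
    return sum(value * len(K) for K, value in x.items())


# Hypothetical toy assignment on terminals a, b, c.
x = {frozenset("abc"): Fraction(1, 2), frozenset("ab"): Fraction(1, 4)}
x2 = shrink(x, "abc", "ab", Fraction(1, 4))
```

In this toy instance the potential $\sum_K x_K|K|$ drops by exactly $\delta$, which is the strict decrease the proof of Theorem 3 relies on.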

Proof of Theorem 3. It suffices to exhibit an optimum solution of $(\mathcal{P})$ which satisfies the equality in $(\mathcal{P}')$. Let $x$ be an optimal solution to $(\mathcal{P})$ which minimizes the sum $\sum_{K \in \mathcal{K}} x_K |K|$.

Claim 3.3. For every $K$ with $x_K > 0$ and for every $r \in K$, there exists a tight partition $\pi$ (w.r.t. $x$) such that the part of $\pi$ containing $r$ contains no other vertex of $K$.

Proof. Let $K' = K \setminus \{r\}$. If the above is not true, then for every tight partition $\pi$ we have $\mathrm{rc}_K^{\pi} = \mathrm{rc}_{K'}^{\pi}$. We now claim that there is a $\delta > 0$ such that we can perform $\mathrm{Shrink}(x, K, K', \delta)$ while retaining feasibility in $(\mathcal{P})$. This is a contradiction since the shrink operation strictly reduces $\sum_K |K| x_K$ and doesn't increase cost. Specifically, take

    $\delta := \min\bigl\{ x_K,\ \min_{\pi : \mathrm{rc}_K^{\pi} \neq \mathrm{rc}_{K'}^{\pi}} \bigl( \sum_{L \in \mathcal{K}} x_L \mathrm{rc}_L^{\pi} - r(\pi) + 1 \bigr) \bigr\}$

which is positive since for tight partitions we have $\mathrm{rc}_K^{\pi} = \mathrm{rc}_{K'}^{\pi}$. □

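Let $\mathrm{tight}(x)$ be the set of tight partitions, and $\pi^* := \bigwedge \{\pi \mid \pi \in \mathrm{tight}(x)\}$ the meet of all tight partitions. By Property 2.4, $\pi^*$ is tight. By Claim 3.3, for any $K$ with $x_K > 0$, we have $\mathrm{rc}_K^{\pi^*} = |K| - 1$. Thus, $r(\pi^*) - 1 = \sum_{K \in \mathcal{K}} x_K \mathrm{rc}_K^{\pi^*} = \sum_{K \in \mathcal{K}} x_K (|K| - 1) \ge r(\underline{\pi}) - 1$. But since $\underline{\pi}$ is the unique maximal-rank partition, this implies $\pi^* = \underline{\pi}$. Thus $\underline{\pi}$ is tight. This implies $x \in (\mathcal{P}')$. □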

3.2 Partition and Subtour Elimination Relaxations

Theorem 4. The feasible regions of $(\mathcal{P}')$ and $(\mathcal{S})$ are the same.

Proof. Let $x$ be any feasible solution to the LP $(\mathcal{S})$. Note that the equality constraint of $(\mathcal{P}')$ is the same as that of $(\mathcal{S})$. We now show that $x$ satisfies (5). Fix a partition $\pi = \{\pi_1, \ldots, \pi_t\}$, so $t = r(\pi)$. For each $1 \le i \le t$, subtract the inequality constraint in $(\mathcal{S})$ with $S = \pi_i$ from the equality constraint in $(\mathcal{S})$ to obtain

    $\sum_{K \in \mathcal{K}} x_K \bigl( \rho(K) - \sum_{i=1}^{t} \rho(K \cap \pi_i) \bigr) \ge \rho(R) - \sum_{i=1}^{t} \rho(\pi_i)$.    (17)

From Lemma 2.9, $\rho(K) - \sum_{i=1}^{t} \rho(K \cap \pi_i) = \mathrm{rc}_K^{\pi}$. We also have $\rho(R) - \sum_{i=1}^{t} \rho(\pi_i) = |R| - 1 - (|R| - r(\pi)) = r(\pi) - 1$. Thus $x$ is a feasible solution to the LP $(\mathcal{P}')$.

Now, let $x$ be a feasible solution to $(\mathcal{P}')$; it suffices to show that it satisfies the inequality constraints of $(\mathcal{S})$. Fix a set $S \subset R$. When $S = \emptyset$ the inequality constraint is vacuously true, so we may assume $S \neq \emptyset$. Let $R \setminus S = \{r_1, \ldots, r_u\}$. Consider the partition $\pi = \{\{r_1\}, \ldots, \{r_u\}, S\}$. Subtract (5) for this $\pi$ from the equality constraint in $(\mathcal{P}')$ to obtain

    $\sum_{K \in \mathcal{K}} x_K \bigl( \rho(K) - \mathrm{rc}_K^{\pi} \bigr) \le \rho(R) - r(\pi) + 1$.    (18)

Using Lemma 2.9 and the fact that $\rho(K \cap \{r_j\}) = 0$ (the set is either empty or a singleton), we get $\rho(K) - \mathrm{rc}_K^{\pi} = \rho(K \cap S)$. Finally, as $\rho(R) - r(\pi) + 1 = |R| - 1 - (|R \setminus S| + 1) + 1 = \rho(S)$, the inequality (18) is the same as the constraint needed. Thus $x$ is a feasible solution to $(\mathcal{S})$, proving the theorem. □
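The two identities used in this proof are easy to sanity-check numerically. The sketch below assumes the definitions $\rho(X) = \max(0, |X| - 1)$ and $\mathrm{rc}_K^{\pi} = (\text{number of parts of } \pi \text{ met by } K) - 1$ from earlier in the paper; the function names and the toy partition are illustrative choices made here.

```python
from itertools import combinations


def rho(X):
    """rho(X) = max(0, |X| - 1), as assumed from the definition of (S)."""
    return max(0, len(X) - 1)


def rank_contribution(K, partition):
    """rc_K^pi: number of parts of the partition that K touches, minus one."""
    return sum(1 for part in partition if K & part) - 1


R = frozenset(range(6))
partition = [frozenset({0, 1}), frozenset({2}), frozenset({3, 4, 5})]

# Identity from Lemma 2.9, checked for every hyperedge K with |K| >= 1.
lemma_2_9_holds = all(
    rho(frozenset(K)) - sum(rho(frozenset(K) & part) for part in partition)
    == rank_contribution(frozenset(K), partition)
    for size in range(1, len(R) + 1)
    for K in combinations(R, size)
)

# Right-hand side identity: rho(R) - sum_i rho(pi_i) = r(pi) - 1.
rhs_gap = rho(R) - sum(rho(part) for part in partition)
```

Here `rhs_gap` equals $r(\pi) - 1 = 2$ for the three-part partition above, matching the right-hand side of (17).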

3.3 Partition and Bidirected Cut Relaxations in Quasibipartite Instances

Theorem 5. On quasibipartite Steiner tree instances, $\mathrm{OPT}(\mathcal{B}) \ge \mathrm{OPT}(\mathcal{D})$.

To prove Theorem 5, we look at the duals of the two LPs and we show $\mathrm{OPT}(\mathcal{B}_D) \ge \mathrm{OPT}(\mathcal{D}_D)$ in quasibipartite instances. Recall that the support of a solution to $(\mathcal{D}_D)$ is the family of sets with positive $z_U$. A family of sets is called laminar if for any two of its sets $A, B$ we have $A \subseteq B$, $B \subseteq A$, or $A \cap B = \emptyset$.

Lemma 3.4. There exists an optimal solution to $(\mathcal{D}_D)$ whose support is a laminar family of sets.

Proof. Choose an optimal solution $z$ to $(\mathcal{D}_D)$ which maximizes $\sum_U z_U |U|^2$ among all optimal solutions. We claim that the support of this solution is laminar. Suppose not; then there exist $U$ and $U'$, neither containing the other, with $U \cap U' \neq \emptyset$, $z_U > 0$ and $z_{U'} > 0$. Define $z'$ to be the same as $z$ except $z'_U = z_U - \delta$, $z'_{U'} = z_{U'} - \delta$, $z'_{U \cup U'} = z_{U \cup U'} + \delta$ and $z'_{U \cap U'} = z_{U \cap U'} + \delta$; we will show that for small $\delta > 0$, $z'$ is feasible. Note that $U \cap U'$ is not empty and $U \cup U'$ doesn't contain $r$, and the objective value remains unchanged. Also note that for any $K$ and $i \in K$, if $z_{U \cup U'}$ or $z_{U \cap U'}$ appears in the summand of a constraint, then at least one of $z_U$ or $z_{U'}$ also appears. If both $z_{U \cup U'}$ and $z_{U \cap U'}$ appear, then both $z_U$ and $z_{U'}$ appear. Thus $z'$ is an optimal solution and $\sum_U z'_U |U|^2 > \sum_U z_U |U|^2$, contradicting the choice of $z$. □
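As a small illustration of the uncrossing step in this proof, the following sketch checks laminarity of a set family and evaluates how much the potential $\sum_U z_U|U|^2$ changes, per unit of $\delta$, when a crossing pair $U, U'$ is replaced by $U \cup U'$ and $U \cap U'$. The helper names are hypothetical and the example sets are made up.

```python
from itertools import combinations


def is_laminar(family):
    """True if every two sets are nested or disjoint."""
    sets = [frozenset(s) for s in family]
    return all(a <= b or b <= a or not (a & b) for a, b in combinations(sets, 2))


def uncross_gain(U, W):
    """Change in sum_U z_U |U|^2 per unit of delta moved from U, W to U|W, U&W."""
    U, W = frozenset(U), frozenset(W)
    return len(U | W) ** 2 + len(U & W) ** 2 - len(U) ** 2 - len(W) ** 2


crossing_family = [{1, 2, 3}, {3, 4}]
uncrossed_family = [{1, 2, 3, 4}, {3}]
gain_per_delta = uncross_gain({1, 2, 3}, {3, 4})
```

A short calculation shows the per-unit change equals $2\,|U \setminus U'|\,|U' \setminus U|$, which is strictly positive exactly when neither set contains the other; this is why the maximizer of the potential must have laminar support.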

Lemma 3.5. For quasibipartite instances, given a solution of $(\mathcal{D}_D)$ with laminar support, we can get a feasible solution to $(\mathcal{B}_D)$ of the same value.

Proof. This lemma is the heart of the theorem, and is a little technical to prove. We first give a sketch of how we convert a feasible solution $z$ of $(\mathcal{D}_D)$ into a feasible solution to $(\mathcal{B}_D)$ of the same value.

Comparing $(\mathcal{D}_D)$ and $(\mathcal{B}_D)$, one first notes that the former has a variable for every valid subset of the terminals, while the latter assigns values to all valid subsets of the entire vertex set. We say that an edge $uv$ is satisfied for a candidate solution $z$ if both a) $\sum_{U : u \in U, v \notin U} z_U \le c_{uv}$ and b) $\sum_{U : v \in U, u \notin U} z_U \le c_{uv}$ hold; $z$ is then feasible for $(\mathcal{B}_D)$ if all edges are satisfied.

Let $z$ be a feasible solution to $(\mathcal{D}_D)$. One easily verifies that all terminal-terminal edges are satisfied. On the other hand, terminal-Steiner edges may initially not be satisfied. To see this, consider the Steiner vertex $v$ and its neighbours depicted in Figure 8. Initially, none of the sets in $z$'s support contains $v$, and the load on the edges incident to $v$ is quite skewed: the left-hand side of condition a) above may be large, while the left-hand side of condition b) is initially 0.

To construct a valid solution for $(\mathcal{B}_D)$, we therefore lift the initial value $z_S$ of each terminal subset $S$ to supersets of $S$, by adding Steiner vertices. The lifting procedure processes each Steiner vertex $v$ one at a time; when processing $v$, we change $z$ by moving dual from some sets $U$ to $U \cup \{v\}$. Such a dual transfer decreases the left-hand side of condition a) for edge $uv$, and increases the (initially 0) left-hand sides of condition b) for edges connecting $v$ to neighbours other than $u$.

We will soon see that there is a way of carefully lifting duals around $v$ that ensures that all edges incident to $v$ become satisfied. The definition of our procedure will ensure that these edges remain satisfied for the rest of the lifting procedure. Since there are no Steiner-Steiner edges, all edges will be satisfied once all Steiner vertices are processed.

Throughout the lifting procedure, we will maintain that $z$ remains unchanged when projected to the terminals. Formally, we maintain the following crucial projection invariant:

    The quantity $\sum_{U : S \subseteq U \subseteq S \cup (V \setminus R)} z_U$ remains constant, for all terminal sets $S$.    (PI)

This invariant leads to two observations: first, the constraint (4) is satisfied by $z$ at all times, even when it is defined on subsets of all vertices; second, $\sum_{U \subseteq V} z_U$ is constant throughout, and the objective value of $z$ in $(\mathcal{B}_D)$ is not affected by the lifting. The existence of a lifting of duals around Steiner vertex $v$ such that (PI) is maintained, and such that all edges incident to $v$ are satisfied, can be phrased as a feasibility problem for a linear system of inequalities. We will use Farkas' lemma and the feasibility of $z$ for (4) to complete the proof.

[Figure 8: Lifting variable $z_U$.]

We now fill in the proof details. Let $\Gamma(v)$ denote the set of neighbours of vertex $v$ in the given graph $G$. In each iteration, where we process Steiner node $v$, let

    $\mathcal{U}_v := \{U : z_U > 0 \text{ and } U \cap \Gamma(v) \neq \emptyset\}$

be the sets in $z$'s support that contain neighbours of $v$. Note that $U \in \mathcal{U}_v$ could contain Steiner vertices on which the lifting procedure has already taken place. However, by (PI) and by Lemma 3.4 the multi-family $\{U \cap R : U \in \mathcal{U}_v\}$ is laminar. In the lifting process, we will transfer $x_U$ units of the $z_U$ units of dual of each set $U \in \mathcal{U}_v$ to the set $U' = U \cup \{v\}$; this decreases the dual load (LHS of (2)) on arcs from $U \cap \Gamma(v)$ to $v$ (e.g. $uv$ in Figure 8) and increases the dual load on arcs from $v$ to $\Gamma(v) \setminus U$ (e.g. $vu'$ in the figure). The following system of inequalities describes the set of feasible liftings.

    $\forall U \in \mathcal{U}_v : \quad x_U \le z_U$    (L1)
    $\forall u \in \Gamma(v) : \quad \sum_{U : u \in U} (z_U - x_U) \le c_{uv}$    (L2)
    $\forall u \in \Gamma(v) : \quad \sum_{U : u \notin U} x_U \le c_{uv}$    (L3)

Claim 3.6. If (L1), (L2), (L3) have a feasible solution $x \ge 0$, then the lifting procedure can be performed at Steiner vertex $v$, while maintaining the projection invariant property.

Proof. Define the new solution to be $z_U := z_U - x_U$ and $z_{U \cup \{v\}} := x_U$ for all $U \in \mathcal{U}_v$, with $z_U$ unchanged for all other $U$. It is easy to check that all edges which were satisfied remain satisfied, and (L2) and (L3) imply that all edges incident to $v$ are satisfied. Also note that the projection invariant property is maintained. □
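The update in the proof of Claim 3.6 can be mimicked on a toy dual. The sketch below is a hedged illustration only: duals are stored as a dictionary keyed by frozensets, `lift` and `projection` are names invented here, and the lifted amount is added to any value already on $U \cup \{v\}$ (the source simply assigns it, since that value is initially zero).

```python
def lift(z, x, v):
    """Move x[U] units of dual from each set U to U | {v}; returns a new dict."""
    new_z = dict(z)
    for U, amount in x.items():
        if not 0 <= amount <= z.get(U, 0):
            raise ValueError("need 0 <= x_U <= z_U, cf. constraint (L1)")
        if amount == 0:
            continue
        new_z[U] = new_z[U] - amount
        lifted = U | {v}
        new_z[lifted] = new_z.get(lifted, 0) + amount
    return new_z


def projection(z, terminals):
    """Total dual per terminal trace U & R, the quantity fixed by (PI)."""
    proj = {}
    for U, value in z.items():
        S = U & terminals
        proj[S] = proj.get(S, 0) + value
    return proj


# Hypothetical instance: terminals a, b and one Steiner vertex v.
terminals = frozenset({"a", "b"})
z = {frozenset({"a"}): 3, frozenset({"a", "b"}): 1}
x = {frozenset({"a"}): 2}
z_lifted = lift(z, x, "v")
```

Comparing `projection(z, terminals)` with `projection(z_lifted, terminals)` shows that the projection to the terminals is unchanged, as (PI) requires.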

By Farkas' lemma, if (L1), (L2), (L3) do not have a feasible solution $x \ge 0$, then there exist non-negative multipliers ($\lambda_U$ for all $U \in \mathcal{U}_v$, and $\alpha_u, \beta_u$ for all $u \in \Gamma(v)$) satisfying the following dual set of linear inequalities:

    $\sum_{U \in \mathcal{U}_v} \lambda_U z_U + \sum_{u \in \Gamma(v)} \alpha_u \bigl( c_{uv} - \sum_{U : u \in U} z_U \bigr) + \sum_{u \in \Gamma(v)} \beta_u c_{uv} < 0$    (D1)
    $\forall U \in \mathcal{U}_v : \quad \lambda_U - \sum_{u \in U} \alpha_u + \sum_{u \notin U} \beta_u \ge 0$    (D2)

As a technicality, note that the sub-system $\{$(L1), (L2), $x \ge 0\}$ is feasible (take $x = z$). Thus any $\alpha, \beta, \lambda$ satisfying (D1) and (D2) has $\sum_u \beta_u > 0$; dividing all $\alpha, \beta, \lambda$ by $\sum_u \beta_u$, we may assume without loss of generality that

    $\sum_{u \in \Gamma(v)} \beta_u = 1$.    (D3)

Subtracting (D3) from (D2) allows us to rewrite the latter set of constraints conveniently as

    $\forall U \in \mathcal{U}_v : \quad \lambda_U - \sum_{u \in U} (\alpha_u + \beta_u) + 1 \ge 0$.    (D2')

The following claim shows that (L1), (L2), (L3) does have a feasible solution, and thus by Claim 3.6, lifting can be done, which completes the proof of Lemma 3.5.

Claim 3.7. There exists no feasible solution to $\{\alpha, \beta, \lambda \ge 0 :$ (D1), (D2'), and (D3)$\}$.

Proof. Consider the linear program which minimizes the LHS of (D1) subject to the constraints (D2') and (D3). We show that the LP has value at least 0, which will complete the proof.

Let $(\lambda^*, \alpha^*, \beta^*)$ be an optimal solution to the LP. In Lemma 3.8 we will show that the constraint matrix of the LP is totally unimodular; hence, since the right-hand side of the given system is integral, we may assume that $\lambda^*$, $\alpha^*$, and $\beta^*$ are non-negative and integral. From (D3) we infer

    There is a unique $\bar{u} \in \Gamma(v)$ for which $\beta^*_{\bar{u}} = 1$; for all $u \neq \bar{u}$, $\beta^*_u = 0$.    (19)

Moreover, since each $\lambda_U$ appears only in the two constraints (D2') and $\lambda_U \ge 0$, and since $\lambda_U$ has nonnegative coefficient in the objective, we may assume

    $\lambda^*_U = \lambda_U(\alpha^*, \beta^*) := \max\bigl\{ \sum_{u \in U} (\alpha^*_u + \beta^*_u) - 1,\ 0 \bigr\}$    (20)

for all $U$.

Next, we establish the following:

    $\alpha^*_u + \beta^*_u \in \{0, 1\}$ for all $u \in \Gamma(v)$.    (21)

Suppose for the sake of contradiction that property (21) does not hold for our solution. Let $u$ be such that $\alpha^*_u + \beta^*_u \ge 2$. By (19), $\alpha^*_u \ge 1$. We propose the following update to our solution: decrease $\alpha^*_u$ by 1 (which by (20) will decrease $\lambda^*_U$ by 1 for all $U \in \mathcal{U}_v$ containing $u$). This maintains the feasibility of (D2'), and the objective value decreases by

    $\sum_{U \in \mathcal{U}_v : u \in U} z_U + \bigl( c_{uv} - \sum_{U : u \in U} z_U \bigr)$

which is non-negative as $c \ge 0$. By repeating this operation, we may clearly ensure property (21).

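Let $K \subseteq \Gamma(v)$ be the set $\{u \mid \alpha^*_u + \beta^*_u = 1\}$ and recall that $\bar{u}$ is the unique terminal with $\beta^*_{\bar{u}} = 1$; $\bar{u}$ is clearly a member of $K$. At $(\alpha^*, \beta^*, \lambda^*)$, we evaluate the objective and collect like terms to get value

    $\sum_{U \in \mathcal{U}_v} z_U \rho(U \cap K) + \sum_{u \in K \setminus \bar{u}} \bigl( c_{uv} - \sum_{U : u \in U} z_U \bigr) + c_{\bar{u}v}$
    $= \sum_{u \in K} c_{uv} + \sum_{U \in \mathcal{U}_v} z_U \bigl( \rho(U \cap K) - |(K \setminus \bar{u}) \cap U| \bigr)$
    $= \sum_{u \in K} c_{uv} - \sum_{U \in \mathcal{U}_v : U \cap K \neq \emptyset,\ \bar{u} \notin U} z_U$

where the last equality follows by considering cases. Finally, combining the fact that $\sum_{u \in K} c_{uv} \ge C_K$ (since these edges form one possible full component on terminal set $K$) together with (4) for the pair $(K, \bar{u})$, it follows that the LP's optimal value is non-negative as needed. □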

Lemma 3.8. The incidence matrix defined by (D2') and (D3) is totally unimodular.

Proof. The incidence matrix has $|\mathcal{U}_v| + 1$ rows ($|\mathcal{U}_v|$ corresponding to (D2') and one last row corresponding to (D3)) and $|\mathcal{U}_v| + 2|\Gamma(v)|$ columns. Furthermore, the columns corresponding to the $\alpha_u$'s are the same as those corresponding to the $\beta_u$'s, except for the last row, where there are 0's in the $\alpha$-columns and 1's in the $\beta$-columns.

To show that this matrix is totally unimodular we use Ghouila-Houri's characterization of total unimodularity (e.g. see [30, Thm. 19.3]):

Theorem 6 (Ghouila-Houri 1962). A matrix is totally unimodular iff the following holds for every subset $\mathcal{R}$ of rows: we can assign weights $w_r \in \{-1, +1\}$ to each row $r \in \mathcal{R}$ such that $\sum_{r \in \mathcal{R}} w_r r$ is a $\{0, \pm 1\}$-vector.

Note that we can safely ignore the columns corresponding to variables $\lambda_U$ for sets $U \in \mathcal{U}_v$, since each of them contains a single 1 occurring in constraint (D2') for set $U$.

The row subset $\mathcal{R}$ corresponds to a subset of $\mathcal{U}_v$, which we will denote $\mathcal{R} \cap \mathcal{U}_v$, plus possibly the single row corresponding to (D3). Each row in $\mathcal{R} \cap \mathcal{U}_v$ has its values determined by the characteristic vector of $U \cap \Gamma(v)$. So long as any set appears more than once in $\{U \cap \Gamma(v) \mid U \in \mathcal{R} \cap \mathcal{U}_v\}$ we can assign one copy weight $+1$ and the other copy weight $-1$; these rows cancel out. Thus, henceforth we assume $\{U \cap \Gamma(v) \mid U \in \mathcal{R} \cap \mathcal{U}_v\}$ has no duplicate sets.

There is a standard representation of a laminar family as a forest of rooted trees, where there is a node corresponding to each set, with containment in the family corresponding to ancestry in the forest. Given the forest for the laminar family $\{U \cap \Gamma(v) \mid U \in \mathcal{R} \cap \mathcal{U}_v\}$, the assignment of weights to the rows of the matrix is as follows. Let the root nodes of all trees be at height 0, with height increasing as one goes to children nodes. Give weight $-1$ to rows corresponding to nodes at even height, and weight $+1$ to rows corresponding to nodes at odd height. If $\mathcal{R}$ contains the row corresponding to (D3), give it weight $+1$.

Finally, let us argue that these weights have the needed property. Consider first a column corresponding to $\alpha_u$ for any $u$. The rows of $\mathcal{R}$ with 1 in this column form a path, from the largest set containing $u$ (which is a root node) to the smallest set containing $u$. The weighted sum in this column is an alternating sum $-1 + 1 - 1 + 1 \cdots$, which is either $-1$ or $0$, which is in $\{0, \pm 1\}$ as needed. Second, in a column for some $\beta_u$, if $\mathcal{R}$ doesn't contain (resp. contains) the row corresponding to (D3), the weighted sum is the same as for $\alpha_u$ (resp. plus 1); in either case its weighted sum is in $\{0, \pm 1\}$ as needed. □
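The alternating weighting used in this proof can be tried out on a small laminar family. The sketch below is an illustration under assumptions made here: the family has no duplicate sets, the height of a set is computed as its number of strict supersets in the family, and the names `alternating_weights` and `column_sums` are hypothetical.

```python
def alternating_weights(family):
    """Weight -1 for sets at even height in the laminar forest, +1 at odd height."""
    sets = [frozenset(s) for s in family]
    weights = {}
    for s in sets:
        height = sum(1 for t in sets if s < t)  # number of strict supersets
        weights[s] = -1 if height % 2 == 0 else 1
    return weights


def column_sums(weights, ground_set):
    """Weighted sum of the characteristic vectors, one entry per element."""
    return {u: sum(w for s, w in weights.items() if u in s) for u in ground_set}


# A made-up laminar family on {1, ..., 6}.
family = [{1, 2, 3, 4}, {1, 2}, {1}, {3}, {5, 6}]
weights = alternating_weights(family)
sums = column_sums(weights, range(1, 7))
```

For each element the sets containing it form a chain with heights $0, 1, 2, \ldots$, so every column sum is an alternating sum starting at $-1$, and the computed values all lie in $\{-1, 0\}$.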

This finishes the proof of Lemma 3.5, and hence also that of Theorem 5. □ 

4 Improved Integrality Gap Upper Bounds 

We first show the improved bound of 73/60 for uniformly quasibipartite graphs. We then show the $2\sqrt{2} - 1 \doteq 1.828$ upper bound on general graphs, which contains the main ideas, and then end by giving a $\sqrt{3} \doteq 1.732$ upper bound.

4.1 Uniformly Quasibipartite Instances 

Uniformly quasibipartite instances of the Steiner tree problem are quasibipartite graphs where all edges incident on a given Steiner vertex have the same cost. They were first studied by Gröpl et al. [20], who gave a 73/60 factor approximation algorithm. In the following, we show that the cost of the returned tree is no more than $\frac{73}{60}\,\mathrm{OPT}(\mathcal{P})$, which upper-bounds the integrality gap by $\frac{73}{60}$.

We start by describing the algorithm of Gröpl et al. [20] in terms of full components. A collection $\mathcal{K}'$ of full components is acyclic if there is no list of $t > 1$ distinct terminals and hyperedges in $\mathcal{K}'$ of the form $r_1 \in K_1 \ni r_2 \in K_2 \ni \cdots \ni r_t \in K_t \ni r_1$, i.e. there are no hypercycles.

Procedure RatioGreedy
1: Initialize the set of acyclic components $\mathcal{L}$ to $\emptyset$.
2: Let $L^*$ be a minimizer of $C_L / (|L| - 1)$ over all full components $L$ such that $|L| \ge 2$ and $\{L\} \cup \mathcal{L}$ is acyclic.
3: Add $L^*$ to $\mathcal{L}$.
4: Continue until $(R, \mathcal{L})$ is a hyper-spanning tree and return $\mathcal{L}$.

Theorem 7. On a uniformly quasibipartite instance RatioGreedy returns a Steiner tree of cost at most $\frac{73}{60}\,\mathrm{OPT}(\mathcal{P})$.

Proof. Let $t$ denote the number of iterations and $\mathcal{L} := \{L_1, \ldots, L_t\}$ be the ordered sequence of full components obtained. We now define a dual solution to $(\mathcal{P}_D)$. Let $\pi(i)$ denote the partition induced by the connected components of $\{L_1, \ldots, L_i\}$. Let $\theta(i)$ denote $C_{L_i} / (|L_i| - 1)$ and note that $\theta$ is nondecreasing. Define $\theta(0) = 0$ for convenience. We define a dual solution $y$ with

    $y_{\pi(i)} = \theta(i+1) - \theta(i)$

for $0 \le i < t$, and all other coordinates of $y$ set to zero; $y$ is not generally feasible, but we will scale it down to make it so. By evaluating a telescoping sum, it is not hard to find that $\sum_i y_{\pi(i)} (r(\pi(i)) - 1) = C(\mathcal{L})$. In the rest of the proof we will show that for any $K \in \mathcal{K}$, $\sum_i y_{\pi(i)} \mathrm{rc}_K^{\pi(i)} \le \frac{73}{60} \cdot C_K$; by scaling, this also proves that $\frac{60}{73} y$ is a feasible dual solution, and hence completes the proof.

Fix any $K \in \mathcal{K}$ and let $|K| = k$. Since the instance in question is uniformly quasi-bipartite, the full component $K$ is a star with a Steiner centre and edges of a fixed cost $c$ to each terminal in $K$. For $1 \le i < k$, let $\tau(i)$ denote the last iteration $j$ in which $\mathrm{rc}_K^{\pi(j)} \ge k - i$. Let $K_i$ denote any subset of $K$ of size $k - i + 1$ such that $K_i$ contains at most one element from each part of $\pi(\tau(i))$; i.e., $|K_i| = k - i + 1$ and $\mathrm{rc}_{K_i}^{\pi(\tau(i))} = k - i$.

Our analysis hinges on the fact that $K_i$ was a valid choice for $L_{\tau(i)+1}$. More specifically, note that $\{L_1, \ldots, L_{\tau(i)}, K_i\}$ is acyclic, hence by the greedy nature of the algorithm, for any $1 \le i < k$,

    $\theta(\tau(i) + 1) = C_{L_{\tau(i)+1}} / (|L_{\tau(i)+1}| - 1) \le C_{K_i} / (|K_i| - 1) \le c \cdot \frac{k - i + 1}{k - i}$.

Moreover, using the definition of $\tau$ and telescoping we compute

    $\sum_{\pi} y_\pi \mathrm{rc}_K^{\pi} = \sum_{j=0}^{t-1} (\theta(j+1) - \theta(j))\, \mathrm{rc}_K^{\pi(j)} = \sum_{i=1}^{k-1} \theta(\tau(i) + 1) \le \sum_{i=1}^{k-1} c \cdot \frac{k - i + 1}{k - i} = c \cdot (k - 1 + H(k-1))$,

where $H(\cdot)$ denotes the harmonic series. Finally, note that $(k - 1 + H(k-1)) \le \frac{73}{60} k$ for all $k \ge 2$ (achieved at $k = 5$). Therefore, $\frac{60}{73} y$ is a valid solution to $(\mathcal{P}_D)$. □
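The final numerical claim, that $(k - 1 + H(k-1))/k$ is at most $73/60$ with equality at $k = 5$, can be checked with exact rational arithmetic. The following is a minimal sketch; the function name `ratio` and the search range up to 200 are choices made here for illustration (for large $k$ the ratio tends to 1, so a finite range suffices for a sanity check but is not a proof).

```python
from fractions import Fraction


def ratio(k):
    """(k - 1 + H(k-1)) / k, where H(n) is the n-th harmonic number."""
    harmonic = sum(Fraction(1, j) for j in range(1, k))
    return (k - 1 + harmonic) / k


# Exact values for star sizes k = 2, ..., 199.
values = {k: ratio(k) for k in range(2, 200)}
best_k = max(values, key=values.get)
best_value = values[best_k]
```

The maximum over the scanned range is attained at $k = 5$, where $H(4) = 25/12$ gives $(4 + 25/12)/5 = 73/60$.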

4.2 General graphs 

We start with a few definitions and notations in order to prove the $2\sqrt{2} - 1$ and $\sqrt{3}$ integrality gap bounds on $(\mathcal{P})$. Both results use similar algorithms, and the latter is a more complex version of the former. For conciseness we let a "graph" be a triple $G = (V, E, R)$ where $R \subseteq V$ are $G$'s terminals. In the following, we let $\mathrm{mtst}(G; c)$ denote the minimum terminal spanning tree, i.e. the minimum spanning tree of the terminal-induced subgraph $G[R]$ under edge-costs $c : E \to \mathbb{R}$. We will abuse notation and let $\mathrm{mtst}(G; c)$ mean both the tree and its cost under $c$.

When contracting an edge $uv$ in a graph, the new merged node resulting from contraction is defined to be a terminal iff at least one of $u$ or $v$ was a terminal; this is natural since a Steiner tree in the new graph is a minimal set of edges which, together with $uv$, connects all terminals in the old graph. Our algorithm performs contraction, which may introduce parallel edges, but one may delete all but the cheapest edge from each parallel class without affecting the analysis.

Our first algorithm proceeds in stages. In each stage we apply the operation $G \mapsto G/K$, which denotes contracting all edges in some full component $K$. To describe and analyze the algorithm we introduce some notation. For a minimum terminal spanning tree $T = \mathrm{mtst}(G; c)$ define $\mathrm{drop}_T(K; c) := c(T) - \mathrm{mtst}(G/K; c)$. We also define $\mathrm{gain}_T(K; c) := \mathrm{drop}_T(K; c) - c(K)$, where $c(K)$ is the cost of full component $K$. A tree $T$ is called gainless if for every full component $K$ we have $\mathrm{gain}_T(K; c) \le 0$. The following useful fact is implicit in [23] (see also Appendix B).
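The quantities drop and gain can be computed directly from two minimum spanning tree computations. The sketch below is illustrative and rests on assumptions stated here: only terminal-terminal edges are passed in (as `(cost, u, v)` tuples), the terminal graph is connected, contraction of $K$ is modelled by merging $K$'s terminals at zero cost before running Kruskal's algorithm, and all function names are hypothetical.

```python
def mst_cost(nodes, edges, contracted=()):
    """Kruskal's algorithm; nodes in `contracted` are merged for free first."""
    parent = {n: n for n in nodes}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    members = list(contracted)
    for a, b in zip(members, members[1:]):
        parent[find(a)] = find(b)

    total = 0
    for cost, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            total += cost
    return total


def drop(terminals, edges, K):
    """drop_T(K; c) = mtst(G; c) - mtst(G/K; c)."""
    return mst_cost(terminals, edges) - mst_cost(terminals, edges, contracted=K)


def gain(terminals, edges, K, cost_of_K):
    """gain_T(K; c) = drop_T(K; c) - c(K)."""
    return drop(terminals, edges, K) - cost_of_K


# Toy instance: three terminals pairwise at distance 2.
terminals = ["a", "b", "c"]
edges = [(2, "a", "b"), (2, "b", "c"), (2, "a", "c")]
```

With a star full component on all three terminals of total cost 3, the drop is 4 and the gain is 1, so the terminal spanning tree is not gainless in this toy instance; a two-terminal component of cost 3 has negative gain.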

Theorem 8 (Implicit in [23]). If $\mathrm{mtst}(G; c)$ is gainless, then $\mathrm{OPT}(\mathcal{P})$ equals the cost of $\mathrm{mtst}(G; c)$.

We now give the first algorithm and its analysis, which uses a reduced cost trick introduced by Chakrabarty et al. [4].

Procedure Reduced One-Pass Heuristic
1: Define costs $c'_e$ by $c'_e := c_e / \sqrt{2}$ for all terminal-terminal edges $e$, and $c'_e = c_e$ for all other edges. Let $G_1 := G$, $T_1 := \mathrm{mtst}(G_1; c')$, and $i := 1$.
2: The algorithm considers the full components in any order. When we examine a full component $K$, if $\mathrm{gain}_{T_i}(K; c') > 0$, let $K_i := K$, $G_{i+1} := G_i / K_i$, $T_{i+1} := \mathrm{mtst}(G_{i+1}; c')$, and $i := i + 1$.
3: Let $f$ be the final value of $i$. Return the tree $T_{alg} := T_f \cup \bigcup_{i=1}^{f-1} K_i$.

Note that the full components are scanned in any order and they are not examined a priori. Hence the algorithm works just as well if the full components arrive "online," which might be useful for some applications.

Theorem 9. $c(T_{alg}) \le (2\sqrt{2} - 1)\,\mathrm{OPT}(\mathcal{P})$.

Proof. First we claim that $\mathrm{gain}_{T_f}(K; c') \le 0$ for all $K$. To see this there are two cases. If $K = K_i$ for some $i$, then we immediately see that $\mathrm{drop}_{T_j}(K) = 0$ for all $j > i$, so $\mathrm{gain}_{T_f}(K) = -c(K) \le 0$. Otherwise (if for all $i$, $K \neq K_i$) $K$ had nonpositive gain when examined by the algorithm; and the well-known contraction lemma (e.g., see [19, §1.5]) immediately implies that $\mathrm{gain}_{T_i}(K)$ is nonincreasing in $i$, so $\mathrm{gain}_{T_f}(K) \le 0$.

By Theorem 8, $c'(T_f)$ equals the value of $(\mathcal{P})$ on the graph $G_f$ with costs $c'$. Since $c' \le c$, and since at each step we only contract terminals, the value of this optimum must be at most $\mathrm{OPT}(\mathcal{P})$. Using the fact that $c(T_f) = \sqrt{2}\,c'(T_f)$, we get

    $c(T_f) = \sqrt{2}\,c'(T_f) \le \sqrt{2}\,\mathrm{OPT}(\mathcal{P})$.    (22)

Furthermore, for every $i$ we have $\mathrm{gain}_{T_i}(K_i; c') > 0$, that is, $\mathrm{drop}_{T_i}(K_i; c') > c'(K_i) = c(K_i)$. The equality follows since $K_i$ contains no terminal-terminal edges. However, $\mathrm{drop}_{T_i}(K_i; c') = \frac{1}{\sqrt{2}}\,\mathrm{drop}_{T_i}(K_i; c)$ because all edges of $T_i$ are terminal-terminal. Thus, we get for every $i = 1$ to $f - 1$, $\mathrm{drop}_{T_i}(K_i; c) > \sqrt{2} \cdot c(K_i)$.

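Since $\mathrm{drop}_{T_i}(K_i; c) := \mathrm{mtst}(G_i; c) - \mathrm{mtst}(G_{i+1}; c)$, we have

    $\sum_{i=1}^{f-1} \mathrm{drop}_{T_i}(K_i; c) = \mathrm{mtst}(G; c) - c(T_f)$.

Thus, we have

    $\sum_{i=1}^{f-1} c(K_i) \le \frac{1}{\sqrt{2}} \sum_{i=1}^{f-1} \mathrm{drop}_{T_i}(K_i; c) = \frac{1}{\sqrt{2}} \bigl( \mathrm{mtst}(G; c) - c(T_f) \bigr) \le \frac{1}{\sqrt{2}} \bigl( 2\,\mathrm{OPT}(\mathcal{P}) - c(T_f) \bigr)$

where we use the fact that $\mathrm{mtst}(G; c)$ is at most twice $\mathrm{OPT}(\mathcal{P})$.^1 Therefore

    $c(T_{alg}) = c(T_f) + \sum_{i=1}^{f-1} c(K_i) \le \bigl( 1 - \frac{1}{\sqrt{2}} \bigr) c(T_f) + \sqrt{2}\,\mathrm{OPT}(\mathcal{P})$.

Finally, using $c(T_f) \le \sqrt{2}\,\mathrm{OPT}(\mathcal{P})$ from (22), the proof of Theorem 9 is complete. □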
4.2.1 Improving to $\sqrt{3}$

To get the improved factor of $\sqrt{3}$, we use a more refined iterated contraction approach. The crucial new concept is that of the loss of a full component, introduced by Karpinski and Zelikovsky [22]. The intuition is as follows. In each iteration, the $(2\sqrt{2} - 1)$-factor algorithm contracts a full component $K$, and thus commits to include $K$ in the final solution; the new algorithm makes a smaller commitment, by contracting a subset of $K$'s edges, which allows for a possibility of better recovery later.

Given a full component $K$ (viewed as a tree with leaf set $K$ and internal Steiner nodes), $\mathbf{loss}(K)$ is defined to be the minimum-cost subset of $E(K)$ such that $(V(K), \mathbf{loss}(K))$ has at least one terminal per connected component, i.e. the cheapest way in $K$ to connect each Steiner node to the terminal set. We also use $\mathrm{loss}(K)$ to denote the total cost of these edges. Note that no two terminals are connected by $\mathbf{loss}(K)$. A very useful theorem of Karpinski and Zelikovsky [22] is that for any full component $K$, $\mathrm{loss}(K) \le c(K)/2$.
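One standard way to compute the loss, sketched below, is to merge all terminals of the full component into a single super-node and run Kruskal's algorithm on the component's edges; the selected edges connect every Steiner node to some terminal at minimum cost and never join two terminals. This is an illustration under assumptions made here (edge tuples `(cost, u, v)`, node labels as strings, the function name `loss`), not the paper's own procedure.

```python
def loss(tree_edges, terminals):
    """Return (cost, edges) of a min-cost edge set linking each Steiner node to a terminal."""
    nodes = {n for _, u, v in tree_edges for n in (u, v)}
    parent = {n: n for n in nodes}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    # Identify all terminals with each other at zero cost.
    present = [t for t in terminals if t in nodes]
    for a, b in zip(present, present[1:]):
        parent[find(a)] = find(b)

    total, chosen = 0, []
    for cost, u, v in sorted(tree_edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            total += cost
            chosen.append((cost, u, v))
    return total, chosen


# A star with one Steiner centre s, and a component with two Steiner nodes.
star = [(1, "a", "s"), (2, "b", "s"), (3, "c", "s")]
double = [(1, "a", "s1"), (1, "b", "s1"), (5, "s1", "s2"), (2, "c", "s2"), (2, "d", "s2")]
star_loss, star_edges = loss(star, ["a", "b", "c"])
double_loss, _ = loss(double, ["a", "b", "c", "d"])
```

In both toy components the computed loss (1 and 3) is at most half the component cost (6 and 11), consistent with the Karpinski and Zelikovsky bound quoted above.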

Now we have the ingredients to give our new algorithm. In the description below, $\alpha > 1$ is a parameter (which will be set to $\sqrt{3}$). In each iteration, the algorithm contracts the loss of a single full component $K$ (we note it follows that the terminal set has constant size over all iterations).

Procedure Reduced One-Pass Loss-Contracting Heuristic
1: Initially $G_1 := G$, $T_1 := \mathrm{mtst}(G; c)$, and $i := 1$.
2: The algorithm considers the full components in any order. When we examine a full component $K$, if
       $\mathrm{gain}_{T_i}(K; c) > (\alpha - 1)\,\mathrm{loss}(K)$,
   let $K_i := K$, $G_{i+1} := G_i / \mathbf{loss}(K_i)$, $T_{i+1} := \mathrm{mtst}(G_{i+1}; c)$, and $i := i + 1$.
3: Let $f$ be the final value of $i$. Return the tree $T_{alg} := T_f \cup \bigcup_{i=1}^{f-1} \mathbf{loss}(K_i)$.

We now analyze the algorithm.

Claim 4.1. $c(T_f) \le \bigl( \frac{1 + \alpha}{2} \bigr)\,\mathrm{OPT}(\mathcal{P})$.

Proof. Using the contraction lemma again, $\mathrm{gain}_{T_f}(K; c) \le (\alpha - 1)\,\mathrm{loss}(K)$ for all $K$, so

    $\mathrm{drop}_{T_f}(K; c) \le c(K) + (\alpha - 1)\,\mathrm{loss}(K) \le \bigl( \frac{1 + \alpha}{2} \bigr) c(K)$    (23)

since $\mathrm{loss}(K) \le c(K)/2$.

To finish the proof of Claim 4.1, we proceed as in the proof of Equation (22). Define $c'_e := c_e / \bigl( \frac{1 + \alpha}{2} \bigr)$ for all edges $e$ which join two vertices of the original terminal set $R$, and $c'_e = c_e$ for all other edges. Note that (23) implies that $T_f$ is gainless with respect to $c'$. Thus, by Theorem 8, the value of LP $(\mathcal{P})$ on $(G_f, c')$ equals $c'(T_f)$. Since we only reduce costs (as $\alpha > 1$), this optimum is no more than the original $\mathrm{OPT}(\mathcal{P})$, giving us $c'(T_f) \le \mathrm{OPT}(\mathcal{P})$. Now using the definition of $c'$, the proof of the claim is complete. □

^1 This follows using standard arguments, and can be seen, for instance, by applying Theorem 8 to the cost-function with all terminal-terminal costs divided by 2, and using short-cutting.

Claim 4.2. For any $i \ge 1$, we have $c(T_i) - c(T_{i+1}) \ge \mathrm{gain}_{T_i}(K_i; c) + \mathrm{loss}(K_i)$.

Proof. Recall that $T_{i+1}$ is a minimum terminal spanning tree of $G_{i+1}$ under $c$. Consider the following other terminal spanning tree $T$ of $G_{i+1}$: take $T$ to be the union of $K_i / \mathbf{loss}(K_i)$ with $\mathrm{mtst}(G_i / K_i; c)$. Hence $c(T_{i+1}) \le c(T) = \mathrm{mtst}(G_i / K_i; c) + c(K_i) - \mathrm{loss}(K_i)$. Rearranging, and using the definition of gain, we obtain:

    $c(T_i) - c(T_{i+1}) \ge c(T_i) - \mathrm{mtst}(G_i / K_i; c) - c(K_i) + \mathrm{loss}(K_i) = \mathrm{gain}_{T_i}(K_i; c) + \mathrm{loss}(K_i)$,

and this completes the proof. □

Now we are ready to prove the integrality gap upper bound of $\sqrt{3}$.

Theorem 10. $c(T_{alg}) \le \sqrt{3}\,\mathrm{OPT}(\mathcal{P})$.

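Proof. By the algorithm, we have for all $i$ that $\mathrm{gain}_{T_i}(K_i; c) \ge (\alpha - 1)\,\mathrm{loss}(K_i)$, and thus $\mathrm{gain}_{T_i}(K_i; c) + \mathrm{loss}(K_i) \ge \alpha\,\mathrm{loss}(K_i)$. Thus, from Claim 4.2, we get

    $\sum_{i=1}^{f-1} \mathrm{loss}(K_i) \le \frac{1}{\alpha} \sum_{i=1}^{f-1} \bigl( c(T_i) - c(T_{i+1}) \bigr)$.

The right-hand sum telescopes to give us $c(T_1) - c(T_f) = \mathrm{mtst}(G; c) - c(T_f)$. Thus,

    $c(T_{alg}) = c(T_f) + \sum_{i=1}^{f-1} \mathrm{loss}(K_i) \le c(T_f) + \frac{1}{\alpha} \bigl( \mathrm{mtst}(G; c) - c(T_f) \bigr) = \frac{1}{\alpha}\,\mathrm{mtst}(G; c) + \frac{\alpha - 1}{\alpha}\,c(T_f)$
    $\le \Bigl( \frac{2}{\alpha} + \frac{(\alpha - 1)(\alpha + 1)}{2\alpha} \Bigr) \mathrm{OPT}(\mathcal{P}) = \Bigl( \frac{\alpha^2 + 3}{2\alpha} \Bigr) \mathrm{OPT}(\mathcal{P})$

which follows from $\mathrm{mtst}(G; c) \le 2\,\mathrm{OPT}(\mathcal{P})$ and Claim 4.1. Setting $\alpha = \sqrt{3}$, the proof of the theorem is complete. □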
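The choice $\alpha = \sqrt{3}$ can be sanity-checked numerically. The sketch below evaluates the bound from the proof, $\frac{2}{\alpha} + \frac{\alpha - 1}{\alpha} \cdot \frac{1 + \alpha}{2}$, on a grid of values $\alpha > 1$; the function name `bound` and the grid are illustrative choices made here.

```python
import math


def bound(alpha):
    """(1/alpha) * 2 + ((alpha - 1)/alpha) * (1 + alpha)/2, i.e. (alpha^2 + 3)/(2 alpha)."""
    return 2 / alpha + (1 - 1 / alpha) * (1 + alpha) / 2


sqrt3 = math.sqrt(3)
grid = [1 + i / 1000 for i in range(1, 3000)]   # alpha in (1, 4)
grid_min = min(bound(a) for a in grid)
```

Writing the bound as $\frac{\alpha}{2} + \frac{3}{2\alpha}$, the AM-GM inequality gives a minimum value of $\sqrt{3}$ at $\alpha = \sqrt{3}$, which agrees with the grid search up to floating-point error.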



5 Conclusion 

In this paper we looked at several hypergraphic LP relaxations for the Steiner tree problem, and 
showed they all have the same objective value. Furthermore, we noted some connections to the 
bidirected cut relaxation for Steiner trees: although hypergraphic relaxations are stronger than the 
bidirected cut relaxation in general, in quasibipartite graphs all these relaxations are equivalent. We 
obtained structural results about the hypergraphic relaxations showing that basic feasible solutions 
have sparse support. We also showed improved upper bounds on the integrality gaps on the 
hypergraphic relaxations via simple algorithms. 

Reiterating the comments in Section 1.2.3, the hypergraphic LPs are powerful (e.g. as evidenced by Byrka et al. [3]) but may not be manageable for computational implementation. Some interesting areas for future work include: non-ellipsoid-based algorithms to solve the hypergraphic LPs in the r-restricted setting; resolving the complexity of optimizing them in the unrestricted setting; and directly using the bidirected cut relaxation to achieve good results (e.g. in quasi-bipartite instances).

A Directed Hypergraph LP Relaxation

Theorem 11. For any Steiner tree instance, $\mathrm{OPT}(\mathcal{D}) = \mathrm{OPT}(\mathcal{P})$.

Proof. First, we show $\mathrm{OPT}(\mathcal{P}) \le \mathrm{OPT}(\mathcal{D})$. Consider a feasible solution $x$ to $(\mathcal{D})$, and define a solution $x'$ to $(\mathcal{P})$ by $x'_K = \sum_{i \in K} x_{K^i}$; informally, $x'$ is obtained from $x$ by ignoring the orientation of the hyperedges. Clearly $x'$ and $x$ have the same objective value. Further, $x'$ is feasible for $(\mathcal{P})$; to see this, for any partition $\pi$, note that (5) is implied by the sum of constraints (3) over $U$ set to those parts of $\pi$ not containing the root, since any orientation of a full component with rank contribution $t$ must leave at least $t$ parts.

To obtain the reverse direction $\mathrm{OPT}(\mathcal{D}) \le \mathrm{OPT}(\mathcal{P})$, we use a similar strategy. We require some notation and a hypergraph orientation theorem of Frank et al. [14]. For any $U \subseteq R$ we say that a directed hyperedge $K^i$ lies in $\Delta^{in}(U)$ if $i \in U$ and $K \setminus U \neq \emptyset$, i.e. if $K^i \in \Delta^{out}(R \setminus U)$. Two subsets $U$ and $W$ of $R$ are called crossing if all four sets $U \setminus W$, $W \setminus U$, $U \cap W$, and $R \setminus (U \cup W)$ are non-empty. A set-function $p : 2^R \to \mathbb{Z}$ is a crossing supermodular function if

    $p(U) + p(W) \le p(U \cap W) + p(U \cup W)$

for all crossing sets $U$ and $W$. A directed hypergraph is said to cover $p$ if $|\Delta^{in}(U)| \ge p(U)$ for all $U \subseteq R$. Here is the needed result.

Theorem 12 (Frank, Király & Király [14]). Given a hypergraph $H = (R, \mathcal{X})$ and a crossing supermodular function $p$, the hypergraph has an orientation covering $p$ if and only if for every partition $\pi$ of $R$,

(a) $\sum_{K \in \mathcal{X}} \min\{1, \mathrm{rc}_K^{\pi}\} \ge \sum_{\pi_i \in \pi} p(\pi_i)$ and, (b) $\sum_{K \in \mathcal{X}} \mathrm{rc}_K^{\pi} \ge \sum_{\pi_i \in \pi} p(R \setminus \pi_i)$.

We will show every rational solution $x$ to $(\mathcal{P})$ can be fractionally oriented to get a feasible solution for $(\mathcal{D})$, which will complete the proof of Theorem 11. Let $M$ be the smallest integer such that the vector $Mx$ is integral. Let $\mathcal{X}$ be a multi-set of hyperedges which contains $M x_K$ copies of each $K$. Define the function $p$ by $p(U) = M$ if $r \in U \neq R$, and $p(U) = 0$ otherwise; i.e. $p(U) = M$ iff $R \setminus U$ is valid.

Claim A.l. H = (R,X) satisfies conditions (a) and (h). 

Proof. Note Xlyr 67r^'(^\^*) ~ M{r{TT) — 1) since all parts of vr are valid except the part containing 
the root r. Thus condition (b), upon scaling by -p, is a restatement of constraint (5), which holds 
since x is feasible for {V). 

For this $p$, condition (a) follows from (b) in the following sense. Fix a partition $\pi$, and let $\pi_1$
be the part of $\pi$ containing $r$. If $\pi_1 = R$ then (a) is vacuously true, so assume $\pi_1 \neq R$. Let $\sigma$ be
the rank-2 partition $\{\pi_1, R \setminus \pi_1\}$. Then it is easy to check that $\min\{1, \mathrm{rc}_K^{\pi}\} \ge \mathrm{rc}_K^{\sigma}$ for all $K$, and
consequently $\sum_{K \in \mathcal{X}} \min\{1, \mathrm{rc}_K^{\pi}\} \ge \sum_{K \in \mathcal{X}} \mathrm{rc}_K^{\sigma}$ and $\sum_{\pi_i \in \sigma} p(R \setminus \pi_i) = M = \sum_{\pi_i \in \pi} p(\pi_i)$. Thus,
(a) for $\pi$ follows from (b) for $\sigma$. □

It is not hard to check that $p$ is crossing supermodular. Now using Theorem 12, take an
orientation of $\mathcal{X}$ that covers $p$.
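The crossing supermodularity of $p$ can also be confirmed by exhaustive enumeration on a small ground set. The sketch below is a sanity check under assumed names (`make_p`, `crossing`, `violations`) and an assumed five-element ground set; it is not a proof. It also shows why the crossing restriction matters: the inequality can fail for a pair with $U \cup W = R$.

```python
from itertools import combinations

def subsets(R):
    """Yield every subset of R as a frozenset."""
    R = list(R)
    for k in range(len(R) + 1):
        for U in combinations(R, k):
            yield frozenset(U)

def make_p(R, root, M):
    """p(U) = M if the root lies in U and U != R, and 0 otherwise."""
    R = frozenset(R)
    return lambda U: M if (root in U and U != R) else 0

def crossing(U, W, R):
    """All four of U\\W, W\\U, U&W and R\\(U|W) are non-empty."""
    return bool(U - W) and bool(W - U) and bool(U & W) and bool(R - (U | W))

def violations(R, root, M, only_crossing=True):
    """Pairs (U, W) with p(U) + p(W) > p(U & W) + p(U | W)."""
    R = frozenset(R)
    p = make_p(R, root, M)
    bad = []
    for U in subsets(R):
        for W in subsets(R):
            if only_crossing and not crossing(U, W, R):
                continue
            if p(U) + p(W) > p(U & W) + p(U | W):
                bad.append((U, W))
    return bad

# No crossing pair violates the inequality on a 5-element ground set.
print(violations(range(5), root=0, M=3))
# Dropping the crossing requirement exposes violating pairs with U | W == R.
print(len(violations(range(3), root=0, M=3, only_crossing=False)) > 0)
```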

For each $K \in \mathcal{K}$ and each $i \in K$, let $n_{K^i}$ denote the number of the $M x_K$ copies of $K$ that are
oriented as $K^i$, i.e. directed towards $i$. So, $\sum_{i \in K} n_{K^i} = M x_K$. Let $x'_{K^i} := n_{K^i}/M$ for all $K^i$. Hence
$\sum_i x'_{K^i} = x_K$ and $x'$ has the same objective value as $x$.

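To complete the proof, we show $x'$ is feasible for $(\mathcal{D})$. Fix a valid subset $U$ and consider condition
(3) for $U$. Note that $p(R \setminus U) = M$. Therefore, since the orientation covers $p$, we get

$$\sum_{K^i \in \Delta^{\mathrm{out}}(U)} x'_{K^i} = \frac{1}{M} \sum_{K^i \in \Delta^{\mathrm{out}}(U)} n_{K^i} = \frac{1}{M} \sum_{K^i \in \Delta^{\mathrm{in}}(R \setminus U)} n_{K^i} \ge \frac{1}{M}\, p(R \setminus U) = \frac{1}{M} \cdot M = 1$$

as needed. □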
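To make the scaling argument concrete, the following Python sketch carries it out on a toy instance. It is only an illustration under stated assumptions: the instance (three terminals, every hyperedge with value 1/2, so $M = 2$) is made up, and the orientation is found by exhaustive search, which stands in for the constructive content of Theorem 12 and is exponential in the number of copies. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

def valid_sets(R, root):
    """Non-empty subsets of R that avoid the root."""
    others = sorted(v for v in R if v != root)
    for k in range(1, len(others) + 1):
        for U in combinations(others, k):
            yield frozenset(U)

def leaves(K, head, U):
    """K^head lies in Delta^out(U): K meets U and the head is outside U."""
    return bool(K & U) and head not in U

def fractional_orientation(R, root, x, M):
    """Orient the M*x_K copies so every valid U has >= M leaving copies."""
    copies = [K for K, val in x.items() for _ in range(int(val * M))]
    for heads in product(*(sorted(K) for K in copies)):
        oriented = list(zip(copies, heads))
        if all(sum(leaves(K, h, U) for K, h in oriented) >= M
               for U in valid_sets(R, root)):
            x_dir = {}
            for K, h in oriented:
                # x'_{K^i} = n_{K^i} / M
                x_dir[(K, h)] = x_dir.get((K, h), Fraction(0)) + Fraction(1, M)
            return x_dir
    return None  # no covering orientation exists

R = {"r", "a", "b"}
x = {frozenset("rab"): Fraction(1, 2),
     frozenset("ra"): Fraction(1, 2),
     frozenset("rb"): Fraction(1, 2),
     frozenset("ab"): Fraction(1, 2)}
x_dir = fractional_orientation(R, "r", x, M=2)
for (K, head), value in x_dir.items():
    print(sorted(K), "->", head, value)
```

In the returned solution the values $x'_{K^i}$ of each $K$ sum to $x_K$, so the objective value is unchanged, and every valid set has total outgoing value at least 1.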



B Gainless MSTs and Hypergraphic Relaxations 

Theorem 8 (Implicit in [23]). If the MST induced by the terminals is gainless, then $\mathrm{OPT}(\mathcal{P})$
equals the cost of that MST.

Proof. Let $\Pi$ be the set of all partitions of the terminal set. As before, we let $r(\pi)$ be the rank of
a partition $\pi \in \Pi$, and we use $E_\pi$ for the set of edges in our graph that cross the partition; i.e., $E_\pi$
contains all edges whose endpoints lie in different parts of $\pi$. Fulkerson's [15] formulation of the
spanning tree polyhedron and its dual are as follows.



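$$\min \Big\{ \sum_{e \in E} c_e x_e \;:\; x \in \mathbb{R}^{E}_{\ge 0}, \qquad (\mathcal{M})$$

$$\sum_{e \in E_\pi} x_e \ge r(\pi) - 1 \quad \forall \pi \in \Pi \Big\} \qquad (24)$$

$$\max \Big\{ \sum_{\pi} (r(\pi) - 1) \cdot y_\pi \;:\; y \in \mathbb{R}^{\Pi}_{\ge 0}, \qquad (\mathcal{M}_D)$$

$$\sum_{\pi :\, e \in E_\pi} y_\pi \le c_e \quad \forall e \in E \Big\} \qquad (25)$$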



The high-level overview of the proof is as follows. We first give a brief sketch of a folklore
primal-dual interpretation of Kruskal's minimum-spanning tree algorithm with respect to Fulkerson's LP
(for more information see, e.g., [23]). Running Kruskal's algorithm on the terminal set then returns
a minimum spanning tree $T$ and a feasible dual $y$ to $(\mathcal{M}_D)$ such that

$$c(T) = \sum_{\pi} (r(\pi) - 1)\, y_\pi.$$



The final step will be to show that, if the returned MST is gainless, then the spanning tree dual $y$
is feasible for $(\mathcal{P}_D)$, and its value is $c(T)$ as well. Weak duality and the fact that the optimal value
of $(\mathcal{P})$ is at most $c(T)$ imply the theorem.

Kruskal's algorithm can be viewed as a process over time. For each time $\tau \ge 0$, the algorithm
keeps a forest $T^\tau$, and a feasible dual solution $y^\tau$; initially $T^0 = (V, \emptyset)$ and $y^0 = 0$. Let $\pi^\tau$ be the
partition induced by the connected components of $T^\tau$. If $T^\tau$ is not a spanning tree, Kruskal's
algorithm grows the dual variable $y_{\pi^\tau}$ corresponding to the current partition until constraint
(25) for some edge $e$ prevents any further increase. The algorithm then adds $e$ to the partial
tree and continues. The algorithm stops at the first time $\tau^*$ where $T^{\tau^*}$ is a spanning tree.
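The time-based view of Kruskal's algorithm can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the authors' implementation: the function name `kruskal_with_dual`, the edge encoding as two-element frozensets, and the four-vertex cost table are all made up, and the input graph is assumed connected. The dual variable of the current partition grows by the amount of time that passes before the next edge becomes tight.

```python
def kruskal_with_dual(vertices, cost):
    """Return (tree edges, [(partition, y_partition), ...]).

    `cost` maps frozenset({u, v}) to a non-negative edge cost.
    """
    comp = {v: frozenset([v]) for v in vertices}
    tree, dual, now = [], [], 0
    for e in sorted(cost, key=cost.get):
        u, v = tuple(e)
        if comp[u] == comp[v]:
            continue  # e would close a cycle
        partition = frozenset(comp.values())
        if cost[e] > now:
            # Grow y for the current partition until e becomes tight.
            dual.append((partition, cost[e] - now))
            now = cost[e]
        merged = comp[u] | comp[v]
        for w in merged:
            comp[w] = merged
        tree.append(e)
    return tree, dual

def crosses(e, partition):
    """True if the endpoints of e lie in different parts."""
    return not any(e <= part for part in partition)

vertices = "abcd"
cost = {frozenset("ab"): 1, frozenset("bc"): 2, frozenset("cd"): 2,
        frozenset("ac"): 3, frozenset("ad"): 4, frozenset("bd"): 5}
tree, dual = kruskal_with_dual(vertices, cost)

tree_cost = sum(cost[e] for e in tree)
dual_value = sum((len(part) - 1) * y for part, y in dual)
print(tree_cost, dual_value)  # both equal 5 on this instance
```

On this instance the dual constraint (25) holds for every edge and is tight on the three tree edges, matching the description above.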

Let $T$ be the gainless spanning tree returned by Kruskal, and let $y$ be the corresponding dual.
We claim that $y$ is feasible for $(\mathcal{P}_D)$. To see this, consider a full component $K$. Clearly, the rank
contribution $\mathrm{rc}_K^{\pi^0}$ of $K$ to the initial partition $\pi^0$ is $|K| - 1$; similarly, the final rank contribution
$\mathrm{rc}_K^{\pi^{\tau^*}}$ is 0. Every edge that is added during the algorithm's run either leaves the rank contribution
of $K$ unchanged, or it decreases it by 1. Let $e_1, \ldots, e_{|K|-1}$ be the edges of the final tree $T$ whose
addition to $T$ decreases $K$'s rank contribution. Also let



$$0 \le \tau_1 \le \tau_2 \le \ldots \le \tau_{|K|-1} \le \tau^*$$

be the times where these edges are added. Note that, by definition, we must have $c_{e_i} = \tau_i$ for all $i$.
We therefore have

$$\sum_{i=1}^{|K|-1} c_{e_i} = \sum_{i=1}^{|K|-1} \tau_i. \qquad (26)$$

The right-hand side of this equality is easily checked to be equal to

$$\int_0^{\tau^*} \mathrm{rc}_K^{\pi^\tau} \, d\tau,$$

which in turn is equal to $\sum_{\pi} \mathrm{rc}_K^{\pi} y_\pi$, by the definition of Kruskal's algorithm. It is not hard to see
that the left-hand side of (26) is the drop $\mathrm{drop}_T(K)$ induced by $K$. Together with the fact that $T$
is gainless, we obtain

$$C_K \ge \mathrm{drop}_T(K) = \sum_{\pi} \mathrm{rc}_K^{\pi} y_\pi.$$

Now observe that the right-hand side of this equation is the left-hand side of (6). It follows that $y$
is feasible for $(\mathcal{P}_D)$. □
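The chain of equalities behind (26) can be traced numerically. The sketch below is an illustration under assumptions: the function names, the four-vertex cost table and the choice $K = \{a, c, d\}$ are hypothetical, the rank contribution is taken to be the number of parts met by $K$ minus one, and the graph is assumed connected. It records the tree edges whose addition lowers the rank contribution of $K$ and accumulates $\sum_\pi \mathrm{rc}_K^\pi y_\pi$ along the run of Kruskal's algorithm.

```python
def rank_contribution(K, partition):
    """Assumed definition: (number of parts met by K) - 1."""
    return sum(1 for part in partition if part & K) - 1

def kruskal_trace(vertices, cost, K):
    """Return (edges lowering rc_K, sum over partitions of rc_K * y)."""
    comp = {v: frozenset([v]) for v in vertices}
    now, dual_sum, dropping_edges = 0, 0, []
    for e in sorted(cost, key=cost.get):
        u, v = tuple(e)
        if comp[u] == comp[v]:
            continue  # e would close a cycle
        partition = set(comp.values())
        # The current partition's dual grows by cost[e] - now.
        dual_sum += rank_contribution(K, partition) * (cost[e] - now)
        now = cost[e]
        # rc_K drops exactly when both merged components meet K.
        if comp[u] & K and comp[v] & K:
            dropping_edges.append(e)
        merged = comp[u] | comp[v]
        for w in merged:
            comp[w] = merged
    return dropping_edges, dual_sum

cost = {frozenset("ab"): 1, frozenset("bc"): 2, frozenset("cd"): 2,
        frozenset("ac"): 3, frozenset("ad"): 4, frozenset("bd"): 5}
K = frozenset("acd")
edges, dual_sum = kruskal_trace("abcd", cost, K)
print(sum(cost[e] for e in edges), dual_sum)  # the two totals agree
```

Here exactly $|K| - 1 = 2$ edges lower the rank contribution, and their total cost equals the accumulated dual sum, as (26) and the integral identity predict.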